METHOD FOR EVALUATION OF POLYMORPHIC GENETIC SEQUENCES,
AND THE USE THEREOF IN IDENTIFICATION OF HLA TYPES 

DESCRIPTION

BACKGROUND OF THE INVENTION

Genetic testing to determine the presence of or a susceptibility to a disease condition 
offers incredible opportunities for improved medical care, and the potential for such testing 
increases almost daily as ever increasing numbers of disease-associated genes and/or 
mutations are identified. A major hurdle which must be overcome to realize this potential, 
however, is the high cost of testing. This is particularly true in the case of highly 
polymorphic genes where the need to test for a large number of variations may make the 
test procedure appear to be so expensive that routine testing can never be achieved. 

Testing for changes in DNA sequence can proceed via complete sequencing of a target 
nucleic acid molecule, although many persons in the art believe that such testing is too 
expensive to ever be routine. Changes in DNA sequence can also be detected by a
technique called "single-stranded conformational polymorphism" ("SSCP") described by
Orita et al., Genomics 5: 874-879 (1989), or by a modification thereof referred to as
dideoxy-fingerprinting ("ddF") described by Sarkar et al., Genomics 13: 441-443 (1992).
SSCP and ddF both evaluate the pattern of bands created when DNA fragments are 
electrophoretically separated on a non-denaturing electrophoresis gel. This pattern depends 
on a combination of the size of the fragments and of the three-dimensional conformation of 
the undenatured fragments. Thus, the pattern cannot be used for sequencing, because the 
theoretical spacing of the fragment bands is not equal. 

The hierarchical assay methodology described in US Patent No. 5,545,527 and 
International Patent Publication No. WO 96/07761, which are incorporated herein by 
reference, provides a mechanism for systematically reducing the cost per test by utilizing a 
series of different test methodologies which may have significant numbers of results 
incorrectly indicating the absence of a genetic sequence of interest, but which rarely if ever 
yield a result incorrectly indicating the presence of such a genetic sequence. The tests 
employed in the hierarchy may frequently be combinations of different types of molecular 
tests, for example combinations of immunoassays, oligonucleotide probe hybridization
tests, oligonucleotide fragment analyses, and direct nucleic acid sequencing. This
application relates to a particular type of test which can be useful alone or as part of a 
hierarchical testing protocol, particularly for highly polymorphic genes. A particular 
example of the use of this test is its application to determining the allelic type of human 
HLA genes, although the test is applicable to many genes of known sequence, and the 
invention should not be construed as limited to HLA. 

Human HLA genes are part of the major histocompatibility complex (MHC), a cluster
of genes associated with tissue antigens and immune responses. Within the MHC genes are
two groups of genes which are of substantial importance in the success of tissue and organ
transplants between individuals. The HLA Class I genes encode transplantation antigens
which are used by cytotoxic T cells to distinguish self from non-self. The HLA Class II
genes, or immune response genes, determine whether an individual can mount a strong 
response to a particular antigen. Both classes of HLA genes are highly polymorphic, and in 
fact this polymorphism plays a critical role in the immune response potential of a host. On 
the other hand, this polymorphism also places an immunological burden on the host 
transplanted with allogeneic tissues. As a result, careful testing and matching of HLA types 
between tissue donor and recipient is a major factor in the success of allogeneic tissue and 
marrow transplants. 

Typing of HLA genes has proceeded along two basic lines: serological and nucleic 
acid-based. In the case of serological typing, antibodies have been developed which are 
specific for certain types of HLA proteins. Panels of these tests can be performed to 
evaluate the type of a donor or recipient tissue. In nucleic acid-based approaches, samples
of the HLA genes may be hybridized with sequence-specific oligonucleotide probes to
identify particular alleles or allele groups. In some cases, determination of HLA type by
sequencing of the HLA gene has also been proposed. Santamaria P, et al., "HLA Class I
Sequence-Based Typing", Human Immunology 37: 39-50 (1993).

In all of these cases, the test panel performed on each individual sample is extensive,
with the result that the cost of HLA typing is very high. It would therefore be desirable to 
have a method for typing HLA which provided comparable or better reliability at 
substantially reduced cost. It is an object of the present invention to provide such a 
method.

SUMMARY OF THE INVENTION

The method of the invention makes use of a modification of standard sequencing 
technology, preferably in combination with improved data analysis capabilities to provide a 
streamlined method for obtaining information about the allelic type of a sample of genetic 
material. Thus, in accordance with the invention, the allelic type of a polymorphic genetic 
locus in a sample is identified by first combining the sample with a sequencing reaction 
mixture containing a template-dependent nucleic acid polymerase, A, T, G and C
nucleotide feedstocks, one type of chain terminating nucleotide and a sequencing primer
under conditions suitable for template-dependent primer extension to form a plurality of
oligonucleotide fragments of differing lengths, and then evaluating the length of the
oligonucleotide fragments. As in a standard sequencing procedure, the lengths of the
fragments can be evaluated on a denaturing gel, such that the actual length of each
fragment, independent of conformational changes that may be caused by sequence
variations, is determined. The observed bands therefore indicate the positions of the type of
base corresponding to the chain terminating nucleotide in the extended primer. The method 
of the invention differs from standard sequencing procedures, however, because instead of 
performing and evaluating four concurrent reactions, one for each type of chain terminating 
nucleotide, in the method of the invention the sample is concurrently combined with at 
most three sequencing reaction mixtures containing different types of chain terminating 
nucleotides. Preferably, the sample will be combined with only one reaction mixture, 
containing only one type of chain terminating nucleotide and the information obtained from 
this test will be evaluated prior to performing any additional tests on the sample. 
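
By way of illustration only, the relationship between a single chain-terminator reaction and the fragment lengths it produces can be sketched in Python. The example strand, the primer length and the function name are hypothetical and are not taken from this specification; the sketch assumes the extended strand is read 5' to 3' starting immediately after the primer.

```python
def fragment_lengths(extended_strand, terminator, primer_len=0):
    """Lengths of the fragments that end at each occurrence of `terminator`.

    `extended_strand` is the sequence synthesized downstream of the primer
    (hypothetical data). Each reported length includes the primer itself.
    """
    return [primer_len + i + 1
            for i, base in enumerate(extended_strand)
            if base == terminator]

# Hypothetical extended strand; a dideoxy-A reaction terminates at every A.
strand = "GATTACAGA"
a_lengths = fragment_lengths(strand, "A", primer_len=20)
print(a_lengths)  # [22, 25, 27, 29]
```

Each length corresponds to one band on the denaturing gel, so the list of lengths is the "A pattern" for this hypothetical strand.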

In many cases, evaluation of the positions of only a single base will allow for allelic 
typing of the sample. In this case, no further tests need to be performed. Thus, the use of 
the method of the invention can increase laboratory throughput (since up to four times as 
many samples can be processed on the same amount of equipment) and reduce the cost per
test by up to a factor of four compared to sequencing of all four bases for every sample.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 shows the application of the invention to typing of a simple polymorphic gene; 
Fig. 2 illustrates an improved method for distinguishing heterozygotic alleles using the 
present invention; 

Fig. 3 illustrates a situation in which heterozygote pairs remain ambiguous even after 
full sequencing; 

Fig. 4 illustrates the use of a control lane to evaluate the number of intervening bases in
a single base sequencing reaction;

Fig. 5 shows results from an automated DNA sequencing apparatus;

Fig. 6 illustrates peak-by-peak correlation of sequencing results; 

Fig. 7 shows a plot of the maxima of each data peak plotted against the separation 
from the nearest other peak; and 

Figs. 8A-8C illustrate the application of the invention to typing of Chlamydia 
trachomatis. 

DETAILED DESCRIPTION OF THE INVENTION

While the terminology used in this application is standard within the art, the following 
definitions of certain terms are provided to assure clarity. 

1. "Allele" refers to a specific version of a nucleotide sequence at a polymorphic genetic
locus. 

2. "Polymorphism" means the variability found within a population at a genetic locus. 

3. "Polymorphic site" means a given nucleotide location in a genetic locus which is 
variable within a population. 

4. "Gene" or "Genetic locus" means a specific nucleotide sequence within a given 
genome. 

5. The "location" or "position" of a nucleotide in a genetic locus means the number 
assigned to the nucleotide in the gene, generally taken from the cDNA sequence or the
genomic sequence of the gene. 

6. The nucleotides Adenine, Cytosine, Guanine and Thymine are sometimes represented 
by their designations of A, C, G or T, respectively.

While it has long been apparent to persons skilled in the art that knowledge of the
identity of the base at a particular location within a polymorphic genetic locus may be 
sufficient to determine the allelic type of that locus, this knowledge has not led to any 
modification of sequencing procedures. Rather, the knowledge has driven development of 
techniques such as allele-specific hybridization assays, and allele-specific ligation assays. 
Despite the failure of the art to recognize the possibility, however, it is not always 
necessary to determine the sequence of all four nucleotides of a polymorphic genetic locus 
in order to determine which allele is present in a specific patient sample. Certain alleles of a 
genetic locus may be distinguishable on the basis of identification of the location of less 
than four, and often only one nucleotide. This finding allows the development of the 
present method for improved allele identification at a polymorphic genetic locus.

A simple example is to consider a polymorphic site for which only two alleles are 
known, as in Figure 1. In this case, identification of the location of the A nucleotides in the
genetic locus, particularly at site 101, will distinguish whether allele 1 or allele 2 is present.
If a third allele were discovered which had a C at site 101, the presence of the allele could be
distinguished either by the absence at site 101 of an A and a T in independent A and T
reactions or by the presence of a C at site 101.

Traditionally, if sequencing were going to be used to evaluate the allelic type of the 
polymorphic site of Fig. 1, four dideoxy nucleotide "sequencing" reactions of the type 
described by Sanger et al. (Proc. Natl. Acad. Sci. USA 74: 5463-5467 (1977)) would be 
run on the sample concurrently, and the products of the four reactions would then be 
analyzed by polyacrylamide gel electrophoresis (see Chp 7.6, Current Protocols in
Molecular Biology, Eds. Ausubel, F.M. et al. (John Wiley & Sons, 1995)). In this well
known technique, each of the four sequencing reactions generates a plurality of primer
extension products, all of which end with a specific type of dideoxy-nucleotide. Each lane 
on the electrophoresis gel thus reflects the positions of one type of base in the extension 
product, but does not reveal the order and type of nucleotides intervening between the 
bases of this specific type. The information provided by the four lanes is therefore 
combined in known sequencing procedures to arrive at a composite picture of the sequence 
as a whole.

In accordance with the present invention, however, single sequencing reactions are
performed and evaluated independently to provide the number of intervening bases between
each instance of a selected base and thus a precise indication of the positional location of
the selected base. Applying the method of the invention to the simplistic example of Fig. 1,
a single sequencing reaction would first be performed using either dideoxy-A or dideoxy-T
as the chain terminating nucleotide. If the third allelic type did not exist or was unknown,
this single test would be enough to provide a specific result. If the third allelic type was
known to exist and the base present in the sample was not identified by the first test, a
second sequencing test could be performed using either dideoxy-C or the dideoxy-A/T not
used in the first test to resolve the identity of the allelic type. Alternatively, some other test
such as an allele-specific hybridization probe or an antibody test which distinguished well 
between allele 1 or 2 and allele 3 could be used in this case. 

As is clear from this example, the method of the invention specifically identifies
"known" alleles of a polymorphic locus, and is not necessarily useful for identification of
new and hitherto unrecorded alleles. An unknown allele might be missed if it were 
incorrectly assumed that the single nucleotide sequence obtained from a patient sample 
corresponded to a unique allele, when in fact other nucleotides of the allele had been 
rearranged in a new fashion. The method is specific for distinguishing among known alleles 
of a polymorphic locus (though it may fortuitously come across new mutations if the right 
single nucleotide sequence is chosen). Databases listing known alleles must therefore be 
continually updated to provide greatest utility for the invention. 

The advantages of "less than 4" nucleotide analysis of the invention for identifying
alleles are the decrease in costs for reagents and labor and the increased throughput of 
patient samples that can be obtained in a diagnostic laboratory. These advantages can be 
more dramatically demonstrated by considering a system which more closely approximates 
a real world example. For this purpose, we have assumed a population in which only the 
known HLA Class II DR4 alleles exist (of these, 5 alleles, DRB1*0401, DRB1*0402,
DRB1*0405, DRB1*0408, and DRB1*0409, are found in 95% of the North American
population), and in which these alleles are always homozygous.

WO 97/23650 PCT/US96/20202

To determine the order in which the single nucleotide sequences should be performed,
the sequence differences among alleles are evaluated to determine which of the bases will
yield the most information, and the circumstances in which knowledge about two or more
bases yields a definitive typing. To do this, we look first at base A, for example, to 
determine which alleles can be identified unequivocally from a knowledge of the position of 
the A bases within the sample. One way to approach this is to set up a table which shows 
the base for each allele at each polymorphic site, as shown in Table 1, and to determine the 
pattern which would be observed if the A's in the table were detected. Each unique pattern 
can be definitively typed using this one sequencing reaction. For the DR4 alleles, every 
allele (including all of the most widely distributed alleles) except DRB1*0413 and
DRB1*0416 produces a unique pattern. All of the other bases effectively identify fewer
allelic types, and therefore the A reaction is done first. Further, it is very likely that any
given group of samples could be entirely typed using this single sequencing reaction. In the
event that samples were not definitively typed using this first sequencing reaction, any
second sequencing reaction performed on the untyped samples would distinguish between
DRB1*0413 and DRB1*0416.
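
The selection of the first reaction described above can be sketched as follows. This is a minimal illustration under stated assumptions: the five alleles and their bases at four polymorphic sites are invented for the example and are not the Table 1 data, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical alleles: base at each of four polymorphic sites (not Table 1).
ALLELES = {
    "allele_1": "AACT",
    "allele_2": "ATCT",
    "allele_3": "GACT",
    "allele_4": "GGCT",
    "allele_5": "GGTT",
}

def pattern(seq, base):
    """Sites (0-based) at which `base` would produce a band for this allele."""
    return tuple(i for i, b in enumerate(seq) if b == base)

def uniquely_typed(alleles, base):
    """Alleles whose band pattern for `base` is shared with no other allele."""
    patterns = {name: pattern(seq, base) for name, seq in alleles.items()}
    counts = Counter(patterns.values())
    return sorted(name for name, p in patterns.items() if counts[p] == 1)

def rank_bases(alleles):
    """Order A, C, G, T by how many alleles each single reaction can type."""
    return sorted("ACGT",
                  key=lambda b: len(uniquely_typed(alleles, b)),
                  reverse=True)

best = rank_bases(ALLELES)[0]
typed = uniquely_typed(ALLELES, best)
untyped = sorted(set(ALLELES) - set(typed))
print(best, typed, untyped)
```

In this invented data set the A reaction types three of the five alleles on its own, and only the two alleles that share an A pattern would need a second reaction.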

The significance in terms of cost per test of using the method of the invention is 
easily appreciated. Determining the DR4 allelic type of 100 samples using traditional 4 
nucleotide DNA sequencing requires performance of a total of 400 sequencing reactions.
Assuming a cost (reagents plus labor) of $20.00 per test, this would result in a cost per
patient of $80.00. In contrast, in the test using the method of the invention, only the first
test for the positions of A is performed on all samples. Even assuming the statistically
unlikely event that 5% of the samples are of type DRB1*0413 or DRB1*0416, 95 positive
typings will result. The remaining 5 samples are tested using a second (G, C or T) 
sequencing reaction, with the result that all 5 samples are definitively typed. Thus, the cost 
for performing these 100 typings using the method of the invention is $2,100 or $21 per 
patient. 
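
The arithmetic of this comparison can be checked with a short calculation. The figures (100 samples, $20.00 per reaction, 5% of samples needing a second reaction) are those given above; the function name is hypothetical.

```python
def staged_cost(n_samples, cost_per_reaction, retest_fraction):
    """Total and per-patient cost when every sample gets one reaction and
    only the untyped fraction gets a second reaction."""
    retests = round(n_samples * retest_fraction)
    total = (n_samples + retests) * cost_per_reaction
    return total, total / n_samples

# Conventional approach: four reactions for every sample.
conventional_total = 100 * 4 * 20.00
conventional_per_patient = conventional_total / 100

total, per_patient = staged_cost(100, 20.00, 0.05)
print(conventional_total, conventional_per_patient)  # 8000.0 80.0
print(total, per_patient)                            # 2100.0 21.0
```
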

In some cases, the second sequencing reaction performed may not yield unique
patterns for all of the samples tested. In this case, prior to performing a third sequencing
reaction, it is desirable to combine the results of the first two sequencing reactions and
evaluate these composite results for unique base patterns. Thus, for example, a first and
second sequencing reaction may have four alleles which can be characterized as follows:



           A pattern     T pattern
Allele 1   1  3  2       2  2  11
Allele 2   1  3  2       2  4  11
Allele 3   3  4  2       2  2  11
Allele 4   3  4  2       2  3  11



Allele 2 and Allele 4 give unique results from the T-sequencing reaction alone, and can
therefore be typed based upon this information. Alleles 1 and 3, however, have the same
T-sequencing pattern. Because these two alleles have different A-sequencing reaction
patterns, they are clearly distinguishable and can be typed based upon the combined
patterns without further testing.
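
The evaluation of composite patterns can be sketched as follows, using the values from the table above. The sketch assumes each row reads as three A-pattern numbers followed by three T-pattern numbers; that column split is an assumption, although the conclusions do not depend on it. The function name is hypothetical.

```python
from collections import Counter

# Values from the four-allele example; the 3 + 3 column split is assumed.
PATTERNS = {
    "Allele 1": {"A": (1, 3, 2), "T": (2, 2, 11)},
    "Allele 2": {"A": (1, 3, 2), "T": (2, 4, 11)},
    "Allele 3": {"A": (3, 4, 2), "T": (2, 2, 11)},
    "Allele 4": {"A": (3, 4, 2), "T": (2, 3, 11)},
}

def typed_by(patterns, reactions):
    """Alleles whose combined pattern over `reactions` is unique."""
    keys = {name: tuple(p[r] for r in reactions)
            for name, p in patterns.items()}
    counts = Counter(keys.values())
    return sorted(name for name, key in keys.items() if counts[key] == 1)

print(typed_by(PATTERNS, ["T"]))       # ['Allele 2', 'Allele 4']
print(typed_by(PATTERNS, ["A", "T"]))  # all four alleles
```

Checking the composite key before ordering a third reaction is what avoids the unnecessary test for Alleles 1 and 3.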

This substantial reduction in the number of sequencing reactions means that the cost of 
reagents and labor required to perform the reactions is reduced. Further, since each sample 
must be analyzed by electrophoresis, fewer electrophoresis runs need to be performed. For 
example, in an automated DNA sequencer having 40 lanes, such as the Pharmacia A.L.F.™
(Pharmacia, Uppsala, Sweden), up to 40 patient samples can be run on a gel rather than 10 
patient samples using 4 lanes each. In systems such as the Applied Biosystems Inc. 377™, 
(Foster City, CA) which permit the use of 4 fluorescent dyes per lane, 4 patient samples 
may be run per lane instead of one patient sample per lane. Use of networked high-speed 
DNA sequencers with software that can combine data taken from different instruments, 
such as the MICROGENE BLASTER™ sequencer and GENE OBJECTS™ software,
(both part of the OPEN GENE™ System available from Visible Genetics Inc., Toronto, 
Canada) can also enhance this method. 

This same methodology can be applied to virtually any known polymorphic genetic 
locus to obtain efficient characterization of the locus. For example, identification of alleles
in the highly polymorphic Human Leukocyte Antigen (HLA) gene system (Parham, P. et al.,
"Nature of Polymorphism in HLA-A, -B and -C Molecules", Proc. Natl. Acad. Sci. USA
85: 4005-4009 (1988)) will benefit greatly from the method. Moreover, the method is not
limited to human polymorphisms. It may be used for other animals, plants, bacteria, viruses
or fungi. It may be used to distinguish the allelic variants present among a mixed sample of
organisms. In human or animal diagnostics, the method can be used to identify which
subspecies of bacteria or viruses are present in a body sample. This diagnosis could be 
essential for determining whether drug-resistant strains of pathogens are present in an 
individual. 

After developing an assay methodology in the manner outlined above for a particular 
known polymorphic gene, the first step of the method of the invention is obtaining a 
suitable sample of material for testing using this methodology. The genetic material tested 
using the invention may be chromosomal DNA, messenger RNA, cDNA, or any other form
of nucleic acid polymer which is subject to testing to evaluate polymorphism, and may be 
derived from various sources including whole blood, tissue samples including tumor cells, 
sperm, and hair follicles. 

In some cases, it may be advantageous to amplify the sample, for example using 
polymerase chain reaction (PCR) amplification, to create one which is enriched in the 
particular genetic sequences of interest. Amplification primers for this purpose are 
advantageously designed to be highly selective for the genetic locus in question. For 
example, for HLA Class I testing, group specific and locus specific amplification primers 
have been disclosed in US Patent No. 5,424,184 and Cereb et al., "Locus-specific 
amplification of HLA class I genes from genomic DNA: locus-specific sequences in the 
first and third introns of HLA-A, -B and -C alleles." Tissue Antigens 45:1-11 (1995), which
are incorporated herein by reference. 

Once a suitable sample is obtained, the sample is combined with the first sequencing 
reaction mixture. This reaction mixture contains a template-dependent nucleic acid 
polymerase, A, T, G and C nucleotide feedstocks, one type of chain terminating nucleotide 
and a sequencing primer. 

The selection of the template-dependent nucleic acid polymerase is not critical to the 
success of the invention. A preferred polymerase, however, is Thermo Sequenase™, a
thermostable polymerase enzyme marketed by Amersham Life Sciences. Other suitable
enzymes include regular Sequenase™ and other enzymes used in sequencing reactions. 

Selection of appropriate sequencing primers is generally done by finding a part of the 
gene, either in an intron or an exon, that lies near (within about 300 nt) the polymorphic 
region of the gene which is to be evaluated, is 5' to the polymorphic region (either on the 
sense or the antisense strand), and that is highly conserved among all known alleles of the 
gene. A sequencing primer that will hybridize to such a region with high specificity can then 
be used to sequence through the polymorphic region. Other aspects of primer quality, such 
as lack of palindromic sequence, and preferred G/C content are identified in US Patent
No. 5,545,527.

In some cases it is impossible to select one primer that can satisfy all the above 
demands. Two or more primers may be necessary to test among some sub-groups of a 
genetic locus. In these cases it is necessary to attempt a sequencing reaction using one of 
the primers. If hybridization is successful, and a sequencing reaction proceeds, then the 
results can be used to determine allele identity. If no sequencing reactions occur, it may be 
necessary to use another one of the primers. 

The sequencing reaction mixture is processed through multiple cycles during which 
primer is extended and then separated from the template DNA from the sample and new 
primer is reannealed with the template. At the end of these cycles, the product 
oligonucleotide fragments are separated by gel electrophoresis and detected. This process 
is well known in the art. Preferably, this separation is performed in an apparatus of the type 
described in US Patent Application No. 08/353,932, the continuation in part thereof filed 
on December 12, 1995 as International Patent Application No. PCT/US95/15951 using thin
microgels as described in International Patent Application No. PCT/US95/14531, all of
which applications are incorporated herein by reference. 

The practice of the instant invention is assisted by technically advanced methods for 
precisely identifying the location of nucleotides in a genetic locus using single nucleotide 
sequencing. The issue is that in the technique of single nucleotide sequencing using 
dideoxy-sequencing/electrophoresis analysis it is sometimes a challenge to determine how
many nucleotides fall between two of the identified nucleotides:

A AA or A AA

In many cases, there is little difficulty, particularly when short sequencing reaction products 
are examined (200 nt or less), because the electrophoretic separation of reaction products 
follows a highly predictable pattern. A computer or a human can easily determine the 
number of nucleotides lying between two identified nucleotides by simply measuring the 
gap and determining the number of singleton peaks that would otherwise fall in the gap. 
The problem becomes relevant in longer electrophoresis runs where resolution and 
separation of sequencing reaction fragments is lost. In addition, loss of consistency in 
maintaining the temperature, electric field strength or other operating parameters can lead 
to inconsistencies in the spacing between peaks and ambiguities in interpretation. Such 
ambiguities can prevent accurate identification of alleles. 

One simple way to resolve these problems is to run a "control" lane with all samples 
which identifies all possible nucleotide fragment lengths from the genetic locus being 
sequenced, for example by performing a reaction which includes all 4 dideoxy nucleotides. 
The control lane indicates precisely the number of nucleotides that lie in the gaps between 
the identified nucleotides, as in Fig. 4.
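
One way the control lane could be used computationally is sketched below. This is only an illustration: the migration times are invented, and nearest-peak assignment is an assumed alignment rule, not one stated in this specification. Each peak in the single-base lane is assigned to the nearest control-lane peak, and the number of intervening bases is read from the difference in control-lane indices.

```python
import bisect

def nearest_index(sorted_times, t):
    """Index of the control-lane peak closest in time to `t`."""
    i = bisect.bisect_left(sorted_times, t)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(sorted_times)]
    return min(candidates, key=lambda j: abs(sorted_times[j] - t))

def intervening_bases(control_times, sample_times):
    """Number of bases lying between consecutive peaks of the sample lane."""
    idx = [nearest_index(control_times, t) for t in sample_times]
    return [b - a - 1 for a, b in zip(idx, idx[1:])]

# Hypothetical migration times: the control lane has one peak per fragment
# length, and the spacing drifts as the run proceeds.
control = [10.0, 11.1, 12.3, 13.6, 15.0, 16.5, 18.1]
sample_a = [10.1, 13.5, 15.1, 18.0]
print(intervening_bases(control, sample_a))  # [2, 0, 1]
```

Because the gaps are counted against the control peaks rather than against an assumed uniform spacing, the drift in peak spacing does not affect the result.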

Any sequencing format can use such a control lane, be it "manual" sequencing, using 
radioactively labeled oligonucleotides and autoradiograph analysis (see Chp 7, Current 
Protocols in Molecular Biology, Eds. Ausubel, F.M. et al, (John Wiley & Sons; 1995)), or 
automated laser fluorescence systems. 

An improved method for identifying alleles, which does not rely on measuring the
number of nucleotides lying between two identified nucleotides, is disclosed in US Patent
Application Serial No. 08/497,202. Briefly, this method relies on the actual shape of the
data signal ("wave form") received from an automated laser fluorescence DNA analysis
system. The method compares the patient sample wave form to a database of wave forms 
representing the known alleles of the gene. The known wave form that best matches the 
sample wave form identifies the allele in the sample. 
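
The idea of matching a sample wave form against a database can be illustrated with a generic similarity measure. The sketch below uses normalized correlation as a stand-in; it is not the method of Serial No. 08/497,202, whose details are not given here. The traces and names are hypothetical, and the traces are assumed to be already aligned and of equal length.

```python
import math

def normalized_correlation(x, y):
    """Pearson-style correlation between two equal-length traces."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return sum(a * b for a, b in zip(dx, dy)) / denom if denom else 0.0

def best_match(sample, database):
    """Name of the database wave form most similar to the sample."""
    return max(database,
               key=lambda name: normalized_correlation(sample, database[name]))

# Hypothetical reference traces for two known alleles and a noisy sample.
DATABASE = {
    "allele_1": [0, 9, 0, 0, 8, 0, 0, 0],
    "allele_2": [0, 9, 0, 0, 0, 0, 8, 0],
}
sample = [1, 8, 1, 0, 7, 1, 0, 0]
print(best_match(sample, DATABASE))  # allele_1
```
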

A further embodiment of the invention which may be applied in some cases, including 
HLA typing, to further expedite and reduce the expense of testing, involves the
simultaneous use of two chain terminating nucleotides in a single reaction mixture. For
example, a single reaction containing a mixture of ddATP and ddCTP could be performed
initially. The peaks observed on the sequencing gel are either A or C, and cannot be 
distinguished (unless dye-labeled terminators with different labels are used). In some cases, 
however, this information is sufficient to identify the nature of the allele. For example, in 
the simple three allele case shown in Fig. 1, the sequence information would identify the T 
allele unambiguously. For more complicated polymorphic genes, a second sequencing run
may be performed using two chain terminating nucleotides, one being the same as one
included in the first reaction and the other being different from those included in the first
reaction mixture.
These two sequencing procedures permit determination of the position of three bases 
expressly and the fourth base by difference in a total of only two reactions. 
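
The deduction of all four bases from two dual-terminator reactions can be written as set operations. In this sketch the first run is assumed to contain ddATP and ddCTP and the second ddATP and ddGTP; the example sequence and function names are hypothetical. The sketch also assumes that every peak position has been resolved, for example with a control lane.

```python
def peaks(seq, terminators):
    """1-based positions at which a run with these terminators shows a band."""
    return {i + 1 for i, base in enumerate(seq) if base in terminators}

def deduce(run_ac, run_ag, length):
    """Recover the positions of each base from an A+C run and an A+G run."""
    everything = set(range(1, length + 1))
    return {
        "A": sorted(run_ac & run_ag),                # present in both runs
        "C": sorted(run_ac - run_ag),                # only in the A+C run
        "G": sorted(run_ag - run_ac),                # only in the A+G run
        "T": sorted(everything - (run_ac | run_ag)), # by difference
    }

seq = "GATTACAGA"  # hypothetical extended strand
result = deduce(peaks(seq, "AC"), peaks(seq, "AG"), len(seq))
print(result)
```

The base common to both reaction mixtures is found by intersection, the two unshared terminators by set difference, and the fourth base from the positions that show no band in either run.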

As discussed below, some wave forms may represent heterozygote mixtures. The 
database should include wave forms from all known heterozygote combinations to ensure 
that the matching process includes the full variety of possibilities. When a patient sample is 
found to be a possible heterozygote, the software can be designed to inform the user of the 
next analytical test that should be performed to help distinguish among possible allelic 
members of the heterozygote. 

Heterozygous polymorphic genetic loci need special consideration. Where more than 
one variant of the same locus exists in the patient sample, complex results are obtained when
single lane sequencing begins at a commonly shared sequencing primer site. This problem 
is also found in traditional 4 lane sequencing (see Santamaria P, et al "HLA Class I 
Sequence-Based Typing" Human Immunology 37, 39-50 (1993)). However, Figure 2 
illustrates an improved method for distinguishing heterozygotic alleles using the present 
invention. 

The problem presented by a heterozygous allele is illustrated in Fig. 2a. The observed
data from single nucleotide sequencing of the A lane cannot point to the presence of a
unique allele. Either the locus is heterozygous or a new allele has been found. (For well
studied genetic loci, new alleles will be rare, so heterozygosity may be assumed.) The
problem flows from a mixture of alleles in the patient sample which is analyzed. For example,
the observed data may result from the additive combination of allele 1 and allele 2.

Where there are more than two possible alleles, it is necessary to compare each of the 
known allelic variants to the observed data to see if they could result in the observed data.
Each heterozygote pair will have its own distinct pattern. Fig. 2b illustrates that alleles 3
and 4 cannot underlie the observed data because certain A nucleotides in those alleles are
not represented in the data. They are thus eliminated from consideration. The remaining 
alleles 5, 6, and 7 could be used in combination with others to generate the observed data. 

In the case of human genomic DNA, only two alleles at any one locus can generally be
present (one from each chromosome). It is necessary, therefore, to combine all known
alleles to determine if they can be additively combined to result in the observed data. (In
fact, the data appearance of known and hypothetical heterozygote pairs can be prepared
and stored in an additional database to facilitate analysis.) In Fig. 2b combination of alleles
5 and 6 will result in the observed data, and combination of neither 5 & 7 nor 6 & 7 gives 
the desired result. Therefore, if only the alleles 3 to 7 were known, the only two that could 
possibly be combined to result in the observed data would be 5 and 6. Allelic identification 
could be made on this basis. 
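
The elimination and pairing steps described above can be sketched as follows. The A positions assigned to alleles 3 to 7 are invented for illustration and are not read from the drawings; the function names are hypothetical. Only heterozygote pairs are considered in this sketch.

```python
from itertools import combinations

# Hypothetical A positions for five known alleles.
ALLELES = {
    3: {2, 5, 9},
    4: {2, 7, 11},
    5: {2, 5, 8},
    6: {2, 8, 11},
    7: {2},
}
observed = {2, 5, 8, 11}  # A positions seen in the patient sample

def candidate_alleles(alleles, observed):
    """Keep alleles all of whose A positions appear in the observed data."""
    return {name: pos for name, pos in alleles.items() if pos <= observed}

def explaining_pairs(alleles, observed):
    """Allele pairs whose additive combination equals the observed data."""
    cands = candidate_alleles(alleles, observed)
    return [(a, b) for a, b in combinations(sorted(cands), 2)
            if cands[a] | cands[b] == observed]

print(sorted(candidate_alleles(ALLELES, observed)))  # [5, 6, 7]
print(explaining_pairs(ALLELES, observed))           # [(5, 6)]
```

When `explaining_pairs` returns more than one pair, a further reaction for a different base would be needed to choose between them.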

In some cases, where more than one pair of alleles can be combined to obtain the 
observed data, as in Fig. 2c, it is necessary to determine the relative locations of other
nucleotides in order to distinguish which allelic pair is present. Identification of another
specific type of nucleotide serves to distinguish which pair of alleles is present. Fig. 2d
shows further that sometimes observed data may appear to be a homozygote for one allele,
but in fact it may consist of a heterozygote pair, either including the suggested allele or
not. The alleles that might lead to such confusion, by masking possible heterozygotes, can
be identified in the known allele database. Identification of these alleles cannot be
confirmed unless further tests are made which can confirm whether a heterozygote 
underlies the observed data. 

All of the analyses of comparing the known alleles to the observed data can be
conveniently assisted by the use of high speed computer analysis.

In rare cases, such as in Fig. 3, sequencing of all 4 nucleotides will not permit
identification of which allelic pair is present. The ambiguity may be reported as such, especially if
the clinical need for distinguishing is low. Alternatively, high stringency hybridization 
probes may be used, as they can identify the presence of specific allelic variants. Protocols 
for hybridization probes are well known in the art (see Chp 6.4, Current Protocols in 
Molecular Biology, Eds. Ausubel, F.M. et al. (John Wiley & Sons, 1995)).

Occasionally, quantitative measurements of the amount of sequencing reaction
products may be sufficient to distinguish whether only one allele has an A at a specific locus,
or both. It is found experimentally, however, that quantitative analysis of sequencing peak 
heights can only rarely assist in the analysis. 

Quantitative analysis proves more useful for resolving the problem of "allelic dropout". 
In cases of allelic dropout, sequencing reactions identify an apparent homozygote, but only 
because the sequencing primer has failed to initiate sequencing reactions on one of the two 
alleles. This may have resulted from heterogeneity at the sequencing primer site itself, 
which prevents the primer from hybridizing to the target site or initiating chain extension. 
(This problem should be rare, as sequencing primers according to the invention are designed
to hybridize generally to highly conserved areas of the genome.)

Allelic dropout is resolved by amplifying both alleles from genomic DNA using
quantitative polymerase chain reaction (see, for example, Chp. 15, Current Protocols in
Molecular Biology, Eds. Ausubel, F.M. et al., (John Wiley & Sons; 1995)). The sequencing
primer is used as one of a pair of PCR primers. A fragment of DNA spanning the alleles in
question is amplified quantitatively. At the end of the reaction, quantities of PCR products
will be only half the expected amount if only one allele is being amplified. Quantitative
analysis can be made on the basis of peak heights of amplified bands observed by
automated DNA sequencing instruments.
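
As a rough illustration of the half-yield test, the sketch below flags dropout when the measured product quantity lies closer to half of the expected amount than to the full amount. The function name, the use of a single peak height as the yield, and the decision rule are assumptions made for illustration; the text does not specify a cut-off.

```python
def dropout_suspected(observed_yield, expected_yield):
    """Return True when the yield is closer to half the expected amount than to all of it.

    Both arguments are in the same arbitrary units (for example, peak height).
    The midpoint rule used here is an assumed, illustrative threshold.
    """
    if expected_yield <= 0:
        raise ValueError("expected_yield must be positive")
    ratio = observed_yield / expected_yield
    return abs(ratio - 0.5) < abs(ratio - 1.0)

print(dropout_suspected(480, 1000))  # yield near half: dropout suspected
print(dropout_suspected(950, 1000))  # yield near full: both alleles amplified
```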

A plurality of pathogens can produce even more complex results from single 
nucleotide sequencing. The complexity flows from an unlimited number of variants of the 
pathogen that may be present in the patient sample. For example, viruses and bacteria may
have variable surface antigen coding domains which allow them to evade host immune 
system detection. To avoid this problem of variability, the genetic locus selected for 
examination is preferably highly conserved among all variants of the pathogen, such as 
ribosomal DNA or functionally critical protein coding regions of DNA. Where variable 
regions of the pathogen must be analyzed, an extended series of comparisons between the 
observed data and the known alleles can assist the diagnosis by determining which alleles 
are not substantial components of the observed data. 

The method of the present invention lends itself to the construction of tailored kits
which provide components for the sequencing reactions. As described in the examples,
these components include oligonucleotide sequencing primers, enzymes for sequencing,
nucleoside and dideoxynucleoside preparations, and buffers for reactions. Unlike
conventional kits, however, the amount of each type of dideoxynucleoside required for any
given assay is not the same. Thus, for an assay in which the A sequencing reaction is
performed first and on all samples, the amount of dideoxy-A included in the kit may be 5 to
10 times greater than the amount of the other dideoxynucleosides.

The following examples are included to illustrate aspects of the instant invention and 
are not intended to limit the invention in any way. 



Example 1 

Identification of HLA Class II gene alleles present in an individual patient sample can
be performed using the method of the instant invention. For example, DRB1 is a
polymorphic HLA Class II gene with at least 107 known alleles (see Bodmer et al.,
Nomenclature for Factors of the HLA System, 1994. Hum. Imm. 41, 1-20 (1994)).

The broad serological subtype of the patient sample DRB1 allele is first determined by
attempting to amplify the allele using group specific primers.

Genomic DNA is prepared from the patient sample using a standard technique such as 
proteinase K proteolysis. Allele amplification is carried out in Class II PCR buffer: 
10 mM Tris pH 8.4
50 mM KCl
1.5 mM MgCl2
0.1% gelatin

200 uM each of dATP, dCTP, dGTP and dTTP
12 pmol of each group specific primer
40 ng patient sample genomic DNA

Groups are amplified separately. The group specific primers employed are: 



DR1 (product size: 195 & 196)
5'-PRIMER:  TTGTGGCAGCTTAAGTTTGAAT  [Seq ID No. 1]
3'-PRIMERS: CCGCCTCTGCTCCAGGAG  [Seq ID No. 2]
            CCCGCTCGTCTTCCAGGAT  [Seq ID No. 3]




DR2 (15 AND 16) (product size: 197 & 213)
5'-PRIMER:  TCCTGTGGCAGCCTAAGAG  [Seq ID No. 4]
3'-PRIMERS: CCGCGCCTGCTCCAGGAT  [Seq ID No. 5]
            AGGTGTCCACCGCGCGGCG  [Seq ID No. 6]

DR3, 8, 11, 12, 13, 14 (product size: 270)
5'-PRIMER:  CACGTTTCTTGGAGTACTCTAC  [Seq ID No. 7]
3'-PRIMER:  CCGCTGCACTGTGAAGCTCT  [Seq ID No. 8]

DR4 (product size: 260)
5'-PRIMER:  GTTTCTTGGAGCAGGTTAAACA  [Seq ID No. 9]
3'-PRIMERS: CTGCACTGTGAAGCTCTCAC  [Seq ID No. 10]
            CTGCACTGTGAAGCTCTCCA  [Seq ID No. 11]

DR7 (product size: 232)
5'-PRIMER:  CCTGTGGCAGGGTAAGTATA  [Seq ID No. 12]
3'-PRIMER:  CCCGTAGTTGTGTCTGCACAC  [Seq ID No. 13]

DR9 (product size: 236)
5'-PRIMER:  GTTTCTTGAAGCAGGATAAGTTT  [Seq ID No. 14]
3'-PRIMER:  CCCGTAGTTGTGTCTGCACAC  [Seq ID No. 15]

DR10 (product size: 204)
5'-PRIMER:  CGGTTGCTGGAAAGACGCG  [Seq ID No. 16]
3'-PRIMER:  CTGCACTGTGAAGCTCTCAC  [Seq ID No. 17]


The 5'-primers of the above groups are terminally labelled with a fluorophore
such as a fluorescein dye at the 5'-end.

The reaction mixture is mixed well. 2.5 units Taq Polymerase are added and 
mixed immediately prior to thermocycling. The reaction tubes are placed in a Robocycler 
Gradient 96 (Stratagene, Inc.) and subject to thermal cycling as follows: 



1 cycle      94 C    2 min
10 cycles    94 C    15 sec
             67 C    1 min
20 cycles    94 C    10 sec
             61 C    50 sec
             72 C    39 sec
1 cycle      72 C    2 min

4 C cool on ice until ready for electrophoretic analysis.

Seven reactions (one for each group specific primer set) are performed. After
amplification, 2 uL of each of the PCR products are pooled and mixed with 11 uL of
loading buffer consisting of 100% formamide with 5 mg/ml dextran blue. The products are
run on a 6% polyacrylamide electrophoresis gel in an automated fluorescence detection
apparatus such as the Pharmacia A.L.F.™ (Uppsala, Sweden). Size determinations are
performed based on migration distances of known size fragments. The serological group is
identified by the length of the successfully amplified fragment. Only one fragment will
appear if both alleles belong to the same serological group; otherwise, for heterozygotes
containing alleles from two different groups, two fragments appear.
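
The lookup from fragment length to serological group can be sketched as follows. The product sizes are those listed with the group specific primers above; the function name and the optional sizing tolerance are assumptions added for illustration. Because the DR1 (195 & 196) and DR2 (197 & 213) products differ by as little as one base pair, the sketch returns every group compatible with a measured size rather than a single answer.

```python
# Product sizes (bp) for each group specific primer set, as listed above.
PRODUCT_SIZES = {
    "DR1": (195, 196),
    "DR2": (197, 213),
    "DR3/8/11/12/13/14": (270,),
    "DR4": (260,),
    "DR7": (232,),
    "DR9": (236,),
    "DR10": (204,),
}

def groups_for_fragment(size_bp, tolerance_bp=0):
    """Return all groups with a product size within tolerance_bp of size_bp."""
    return [
        group
        for group, sizes in PRODUCT_SIZES.items()
        if any(abs(size_bp - s) <= tolerance_bp for s in sizes)
    ]

print(groups_for_fragment(260))                  # ['DR4']
print(groups_for_fragment(196, tolerance_bp=1))  # ['DR1', 'DR2']
```

The second call shows why single-base sizing resolution matters when DR1 and DR2 products are pooled in one lane.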

Once the serological group is determined, specificity within the group is 
determined by single nucleotide sequencing according to the invention. 

Each positive group from above is individually amplified for sequence analysis. 
The PCR amplification primers are a biotinylated 3'-PRIMER amp B:

(5' Biotin-CCGCTGCACTGTGAAGCTCT 3') [Seq ID No. 8]

and the appropriate 5'-PRIMER described above. The conditions for amplification are
identical to the method described above.

After amplification, sequencing is performed using the following sequencing primer:

5' - GAGTGTCATTTCTTCAA - 3' [Seq ID No. 18]






The PCR product (10 ul) is mixed with 10 ul of washed Dynabeads M-280 (as
per manufacturer's recommendations, Dynal, Oslo, Norway) and incubated for 1 hr at room
temperature. The beads are washed with 50 ul of 1X BW buffer (10 mM Tris, pH 7.5, 1
mM EDTA, 2 M NaCl) followed by 50 ul of 1X TE buffer (10 mM Tris, 1 mM EDTA).
After washing, resuspend the beads in 10 ul of TE and take 3 ul for the sequencing reaction
which consists of:
3 ul bound beads
3 ul sequencing primer (30 ng total)
2 ul 10X sequencing buffer (260 mM Tris-HCl, pH 9.5, 65 mM MgCl2)
2 ul of Thermo Sequenase™ (Amersham Life Sciences, Cleveland) (diluted 1:10 from stock)
3 ul H2O

Final Volume = 13 ul. Keep this sequencing reaction mix on ice. 

Remove 3 ul of the sequencing reaction mix and add to 3 ul of one of the 
following mixtures, depending on the termination reaction desired. 

A termination reaction:
750 uM each of dATP, dCTP, dGTP, and dTTP; 2.5 uM ddATP

C termination reaction:
750 uM each of dATP, dCTP, dGTP, and dTTP; 2.5 uM ddCTP

G termination reaction:
750 uM each of dATP, dCTP, dGTP, and dTTP; 2.5 uM ddGTP

T termination reaction:
750 uM each of dATP, dCTP, dGTP, and dTTP; 2.5 uM ddTTP

Total termination reaction volume: 6 ul

Cycle the termination reaction mixture in a Robocycler for 25 cycles (or fewer if found to
be satisfactory):
95 C    30 sec
50 C    10 sec
70 C    30 sec

After cycling, add 12 ul of loading buffer consisting of 100% formamide with 5
mg/ml dextran blue, and load an appropriate volume onto an automated DNA sequencing
apparatus, such as a Pharmacia A.L.F.

Allele identification requires analysis of results from the automated DNA sequencing
apparatus as in Fig. 5. Fragment length analysis revealed that one allele of the patient
sample was from the DR4 serological subtype (data not shown). Single nucleotide
sequencing was then performed to distinguish among the possible DR4 alleles. Lane 1
illustrates the results of single nucleotide sequencing for the "C" nucleotide of a patient
sample (i.e. using the C termination reaction, above). Lanes 2 and 3 represent C nucleotide
sequence results for 2 of the 22 known DR4 alleles. Similar results for the 20 other alleles
are stored in a database. The patient sample is then compared to the known alleles using
one or more of the methods disclosed in US Patent Application Serial No. US 08/497,202.

In Fig. 5, Lane 1 first requires alignment with the database results. The alignment
requires determination of one or more normalization coefficients (for stretching or
shrinking the results of Lane 1) to provide a high degree of overlap (i.e. maximize the
intersection) with the previously aligned database results. The normalization coefficient(s)
may be calculated using the Genetic Algorithm method of the above noted application, or
another method. The normalization coefficients are then applied to Lane 1. The aligned
result of Lane 1 is then systematically correlated to each of the 22 known alleles.
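
To make the role of the normalization coefficients concrete, the sketch below searches a small grid of stretch and shift values and keeps the combination that brings the most sample peaks onto reference peaks. It is a deliberately simple exhaustive search standing in for "another method"; it is not the Genetic Algorithm of the cited application, and the peak positions, grid values and matching window are all hypothetical.

```python
def align(sample_peaks, reference_peaks, stretches, shifts, window=2.0):
    """Grid-search a stretch and a shift; return (matched_peaks, stretch, shift).

    A sample peak counts as matched when, after applying stretch * p + shift,
    it lies within `window` data points of any reference peak.
    """
    best = None
    for stretch in stretches:
        for shift in shifts:
            moved = [stretch * p + shift for p in sample_peaks]
            score = sum(
                1 for m in moved
                if any(abs(m - r) <= window for r in reference_peaks)
            )
            if best is None or score > best[0]:
                best = (score, stretch, shift)
    return best

reference = [100, 130, 190, 250]   # hypothetical database peak positions
sample = [72, 96, 144, 192]        # same peaks, compressed and offset
print(align(sample, reference, stretches=[1.0, 1.25, 1.5], shifts=[0, 10, 20]))
# (4, 1.25, 10): all four peaks overlap after stretching by 1.25 and shifting by 10
```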

The correlation takes place on a peak by peak basis as illustrated in Fig. 6. Each peak
in the aligned patient data stream, representing a discrete sequencing reaction termination
product, is identified. (Minor peaks representing sequencing artifacts are ignored.) The
area under each peak is calculated within a limited radius of the peak maximum (e.g. 20 data
points for A.L.F. Sequencer results). A similar calculation is made for the area under the
curve of the known allele at the same point. The swath of overlapping areas is then
compared. Any correlation below a threshold of reasonable variation, for example 80%,
indicates that a peak is present in the patient data stream and not in the other. If one peak
is missing, then the known allele is rejected as a possible identifier of the sample.

The reverse comparison is also made: peaks in the known data stream are identified
and compared, one by one, to the patient sample results. Again, the presence of a peak in
one data stream that is not present in the other eliminates the known data stream as an
identifier of the sample.
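
The two-way peak comparison can be sketched as below. The traces are short hypothetical intensity lists, the radius is reduced to 2 data points to keep the example readable, and the "correlation" is taken to be the ratio of the smaller local area to the larger one. That ratio is an assumed stand-in, since the text does not define the correlation measure precisely.

```python
def local_area(trace, center, radius):
    """Sum the signal within `radius` data points of a peak maximum."""
    lo = max(0, center - radius)
    hi = min(len(trace), center + radius + 1)
    return sum(trace[lo:hi])

def peaks_supported(source, other, source_peaks, radius, threshold):
    """Check that every peak of `source` has a comparable area in `other`."""
    for peak in source_peaks:
        a = local_area(source, peak, radius)
        b = local_area(other, peak, radius)
        larger = max(a, b)
        if larger == 0 or min(a, b) / larger < threshold:
            return False
    return True

def allele_is_candidate(patient, patient_peaks, allele, allele_peaks,
                        radius=20, threshold=0.8):
    """Forward and reverse comparison; any unmatched peak rejects the allele."""
    return (peaks_supported(patient, allele, patient_peaks, radius, threshold)
            and peaks_supported(allele, patient, allele_peaks, radius, threshold))

patient  = [0, 0, 1, 5, 1, 0, 0, 0, 1, 5, 1, 0, 0]   # peaks at 3 and 9
allele_a = [0, 0, 1, 5, 1, 0, 0, 0, 1, 5, 1, 0, 0]   # same peaks
allele_b = [0, 0, 1, 5, 1, 0, 5, 0, 1, 5, 1, 0, 0]   # extra peak at 6

print(allele_is_candidate(patient, [3, 9], allele_a, [3, 9], radius=2))     # True
print(allele_is_candidate(patient, [3, 9], allele_b, [3, 6, 9], radius=2))  # False
```

In the second call the forward comparison passes, and it is the reverse comparison that finds the extra peak and rejects the allele.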

In Fig. 5, lane 2, for allele DRB1*0405, has a peak (marked X) not found in the patient
sample. Peak comparison between aligned lane 1 and lane 2 will fall below threshold at the
peak marked X. Lane 3 is for part of known allele DRB1*0401. In this case, each peak is
found to have a correlate in the other data stream. DRB1*0401 may therefore identify the
patient sample. (The results illustrated are much shorter than the 200-300 nt usually used
for comparison, so identity of the patient sample is not confirmed until the full diagnostic
sequence is compared.)

Example 2 

Results are obtained from the patient sample according to Example 1, above. The
sample results are converted into a "text" file as follows. The maximum of each peak is
located and plotted against the separation from the nearest other peak (minor peaks
representing noise are ignored), as shown in Fig. 7. The peaks that are closest together are
assumed to represent single nucleotide separation, and a narrow range for single nucleotide
separation is determined. A series of timing tracks is proposed, each of which attempts to
locate all the peaks in terms of multiples of a possible single nucleotide separation. The
timing track that correlates best (by least mean squares analysis) with the maxima of the
sample data is selected as the correct timing track. The peak maxima are then plotted on the
timing track. The spaces between the peaks are assumed to represent other nucleotides. A
text file may now be generated which identifies the location of all nucleotides of one type
and the single nucleotide steps in between.
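
A minimal sketch of the timing-track selection and text conversion follows. It assumes the candidate spacings have already been restricted to a narrow range around the smallest peak separation (otherwise trivially small spacings would always fit best), anchors each track at the first peak, and writes the chosen nucleotide letter at each peak slot with a dash for every intervening single-nucleotide step. The peak positions, candidate spacings and output symbols are hypothetical.

```python
def best_spacing(peaks, candidates):
    """Pick the spacing whose multiples fit the peak maxima with least squared error."""
    origin = peaks[0]

    def cost(step):
        # Squared distance of each peak from the nearest tick of the timing track.
        return sum(((p - origin) / step - round((p - origin) / step)) ** 2
                   for p in peaks)

    return min(candidates, key=cost)

def to_text(peaks, step, base="A", gap="-"):
    """Place each peak on the timing track and emit one character per nucleotide."""
    origin = peaks[0]
    slots = [round((p - origin) / step) for p in peaks]
    text = [gap] * (slots[-1] + 1)
    for slot in slots:
        text[slot] = base
    return "".join(text)

peaks = [10.0, 20.1, 49.8, 60.2, 100.0]      # hypothetical peak maxima
candidates = [9.0, 9.5, 10.0, 10.5, 11.0]    # narrow range around the closest pair
step = best_spacing(peaks, candidates)
print(step)                  # 10.0
print(to_text(peaks, step))  # AA--AA---A
```

The resulting string can be compared directly against equivalent strings derived from the known allele sequences.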

The text file for the patient sample is compared against all known alleles. The 
known allele that best matches the patient sample identifies the sample. 

Example 3 

For HLA Class II DRB1 serological group DR4, 22 alleles are known. A
hierarchy of single nucleotide sequencing reactions can be used to minimize the number of
reactions required for identification of which allele is present. Reactions are performed
according to the methods of example 1, above.

If it is established from the group specific reaction that only one DRB1 allele is
a DR4 subtype, then identification of that allele is made by the following steps:

1. Determine A nucleotide sequence. This identifies 16 of 22 known alleles; then

2. Determine G nucleotide sequence. This identifies 10 of 22 known alleles; then

3. Combine A and G sequencing results by computer analysis. This identifies all 22
known alleles.

If the patient sample is identified at any one step, then the following step(s) 
need not be performed for that sample. 
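
The counts quoted in the steps above ("identifies 16 of 22") come from asking how many known alleles have a single-nucleotide pattern shared with no other allele. The sketch below shows that calculation on a toy database of four hypothetical alleles; the patterns are invented for illustration and are not DR4 sequences.

```python
from collections import Counter

def uniquely_identified(signatures):
    """Return the alleles whose signature is shared with no other allele."""
    counts = Counter(signatures.values())
    return {name for name, sig in signatures.items() if counts[sig] == 1}

# Hypothetical single-nucleotide tracks for four alleles.
a_track = {"x1": "A--A", "x2": "A--A", "x3": "-A-A", "x4": "AA--"}
g_track = {"x1": "-G--", "x2": "--G-", "x3": "--G-", "x4": "--G-"}
combined = {name: (a_track[name], g_track[name]) for name in a_track}

print(sorted(uniquely_identified(a_track)))   # ['x3', 'x4']
print(sorted(uniquely_identified(g_track)))   # ['x1']
print(sorted(uniquely_identified(combined)))  # ['x1', 'x2', 'x3', 'x4']
```

Running each candidate reaction through such a count is one way to decide which reaction to perform first.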

Example 4 

If the group specific reaction in example 1 indicates that two DR4 alleles are
present in the patient sample, then from the 22 known alleles, there are 253 possible allelic
pair combinations (22 homozygotes + 231 heterozygotes). Again, a hierarchy of single
nucleotide sequencing reactions can be used to minimize the number of reactions required
for identification of which allelic pair is present. Reactions are performed according to the
methods of example 1, above.

1. Sequence G: Distinguishes among 10 homozygote pairs and 64
heterozygote pairs.

2. Sequence A: Distinguishes among 16 homozygote pairs and 23
heterozygote pairs.

3. Combine A and G sequencing results by computer analysis. Identifies all
known homozygotes and 169 heterozygote pairs.

4. Sequence C: Distinguishes among 5 homozygote pairs and 18
heterozygote pairs.

5. Combine A, C and G sequencing results by computer analysis. Identifies all
known homozygotes and 219 heterozygote pairs.

6. Sequence T: Distinguishes one homozygote pair and 5 heterozygote pairs.

7. Combine A, C, G and T sequencing results by computer analysis. Identifies
all known homozygotes and 225 heterozygote pairs.

8. If, at the end of sequencing the 4 nucleotides, allelic pairs still cannot be
distinguished, Sequence Specific Oligonucleotide Probes may be used to distinguish which
of the pairs are present, according to the invention.

If the patient sample is identified at any one step, then the following step(s) 
need not be performed for that sample. 

This example assumes that all alleles will be equally represented among the 
patient samples analyzed. If certain alleles predominate in the population, then it may be 
advantageous to perform reactions definitive for those alleles first, in order to reduce the 
total number of reactions performed. 
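
The figure of 253 allelic pair combinations quoted in this example follows from simple counting: n alleles give n homozygous pairs and n(n-1)/2 heterozygous pairs. A short check, with a hypothetical function name:

```python
def pair_counts(n_alleles):
    """Return (homozygous, heterozygous, total) unordered allele pairs."""
    homozygous = n_alleles
    heterozygous = n_alleles * (n_alleles - 1) // 2
    return homozygous, heterozygous, homozygous + heterozygous

print(pair_counts(22))  # (22, 231, 253)
```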

Example 5

Virtually all the alleles of the HLA Class I C gene can be determined on the 
basis of exon 2 and 3 genomic DNA sequence alone (Cereb, N et al. "Locus-specific 
amplification of HLA class I genes from genomic DNA: locus-specific sequences in the 
first and third introns of HLA- A, -B and -C alleles." Tissue Antigens 45: 1-11 (1995)). The 
primers used amplify the polymorphic exons 2 and 3 of all C-alleles without any co- 
amplification of pseudogenes or B or A alleles. These primers utilize C-specific sequences 
in introns 1, 2 and 3 of the C-locus. 

Identification of alleles in a patient sample is performed according to the 
method of example 1, with the following changes. Patient sample DNA is prepared 
according to standard methods (Current Protocols in Molecular Biology, Eds. Ausubel,
F.M. et al., (John Wiley & Sons; 1995)).

The following primers are used to amplify the HLA Class I C gene exon 2:

Forward Primer; Intron 1
Primer Name: C2I1

5' - AGCGAGTGCCCGCCCGGCGA - 3' SEQ ID NO: 19

Reverse Primer; Intron 2
Primer Name: C2RI2

5' - Biotin - ACCTGGCCCGTCCGTGGGGGATGAG - 3' SEQ ID NO: 20

Amplicon size 407 bp.

The amplification was carried out in PCR buffer composed of 15.6 mM
ammonium sulfate, 67 mM Tris-HCl (pH 8.8), 50 uM EDTA, 1.5 mM MgCl2, 0.01%
gelatin, 0.2 mM of each dNTP (dATP, dCTP, dGTP and dTTP) and 0.2 mM of each
amplification primer. Prior to amplification, 40 ng of patient sample DNA is added, followed
by 2.5 units of Taq Polymerase (Roche Molecular). The amplification cycle consisted of:

             96 C    1 min
5 cycles     96 C    20 sec
             70 C    45 sec
             72 C    25 sec
20 cycles    96 C    20 sec
             65 C    50 sec
             72 C    30 sec
5 cycles     96 C    20 sec
             55 C    60 sec
             72 C    120 sec

In a separate reaction, exon 3 of HLA Class I C is amplified using the following
primers:

Forward primer; intron 2-exon 3 border
Primer name: C3I2E3

5' - Biotin - GACCGCGGGGCCGGGGCCAGGG - 3' SEQ ID NO: 21

Reverse primer; intron 3
Primer name: C3RI3

5' - GGAGATGGGGAAGGCTCCCCACT - 3' SEQ ID NO: 22

Amplicon size 333 bp.

The same reaction conditions as listed for exon 2 are used to amplify the DNA. 

Sequencing reactions are next performed according to the method of example 1 using 
one of the following 5' fluorescent-labeled sequencing primers: 



Exon 2, forward sequencing:
5' - CGGGACGTCGCAGAGGAA - 3' (Intron 3)    SEQ ID NO: 25

Exon 2, reverse sequencing:
5' - GGAGGGTCGGGCGGGTCT - 3' (Intron 2)    SEQ ID NO: 24

Exon 3, forward sequencing:
5' - CCGGGGCGCAGGTCACGA - 3' (Intron 1)    SEQ ID NO: 23



The termination reaction selected depends on whether a forward or reverse primer is 
chosen. Appendix I lists which alleles can be distinguished if a forward primer is used (i.e. 
sequencing template is the anti-sense strand). If a reverse primer is used for sequencing, 
the termination reaction selected is the complementary one (A for T, C for G, and vice 
versa). 
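
The rule for choosing a termination reaction from the primer direction amounts to a base-complement lookup. The sketch below assumes the nucleotide of interest is named on the same strand that a forward primer reads out, as in Appendix I; the function and argument names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def termination_reaction(base, primer_direction):
    """Return the dideoxy termination reaction to run for a given base.

    `base` is the nucleotide as listed for forward-primer sequencing;
    a reverse primer reads the opposite strand, so the complement is used.
    """
    if primer_direction == "forward":
        return base
    if primer_direction == "reverse":
        return COMPLEMENT[base]
    raise ValueError("primer_direction must be 'forward' or 'reverse'")

print(termination_reaction("A", "forward"))  # A
print(termination_reaction("A", "reverse"))  # T
print(termination_reaction("C", "reverse"))  # G
```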

Homozygotic alleles of HLA Class I C are effectively distinguished by the following 
sequencing order: 

1. Determine sense strand A nucleotide sequence. Identifies 24 of 35 known
homozygotes; then

2. Determine sense strand C nucleotide sequence. Identifies 16 of 35 known
homozygotes; then

3. Combine A and C sequencing results by computer analysis. Identifies 31 of 35
known homozygotes; then

4. Determine sense strand G nucleotide sequence. Identifies 14 of 35 known
homozygotes; then

5. Combine A, C and G sequencing results by computer analysis. Identifies 33 of 35
known homozygotes.

The remaining 2 alleles, Cw*12022.hla and Cw*12021.hla, cannot be distinguished by
nucleotide sequencing of only exons 2 and 3. Further reactions according to the invention
may be performed to distinguish among these alleles.

If the patient sample is identified at any one step, then the following step(s) need not
be performed for that sample.

Heterozygotes are analyzed on the same basis; the order of single nucleotide 
sequencing reactions is determined by picking which reactions will distinguish among the 
greatest number of samples (data not shown), and performing those reactions first. 

This example assumes that all alleles will be equally represented among the patient 
samples analyzed. If certain alleles predominate in the population, then it may be 
advantageous to perform reactions definitive for those alleles first, in order to reduce the 
total number of reactions performed. 

Example 6 

One lipoprotein lipase (LPL) variant (Asn291Ser) is associated with reduced
HDL cholesterol levels in premature atherosclerosis. This variant has a single missense
mutation of A to C at nucleotide 1127 of the sense strand in Exon 6. This variant can be
distinguished according to the instant invention as follows.

Exon 6 of the LPL gene from a patient sample is amplified with a 5' PCR
primer located in intron 5 near the 5' boundary of exon 6:

(5'-GCCGAGATACAATCTTGGTG-3') [Seq ID No. 26]

The 3' PCR primer is located in exon 6 a short distance from the Asn291Ser mutation and
labeled with biotin:

(5'-biotin-CAGGTACATTTTGCTGCTTC-3') [Seq ID No. 27]

PCR amplification reactions were performed according to the methods detailed in Reymer,
PWA., et al., "A lipoprotein lipase mutation (Asn291Ser) is associated with reduced HDL
cholesterol levels in premature atherosclerosis." Nature Genetics 10: 28-34 (1995).

Sequencing analysis was then performed according to the Thermo Sequenase™
(Amersham) method of example 1, using a fluorescent-labeled version of the 5' PCR primer
noted above.

Since the deleterious allele has a C at nucleotide 1127 of the sense strand, the C
termination sequencing reaction was performed. The results of the reaction were recorded
on an automated DNA sequencing apparatus and analyzed at the 1127 site. The patient
sample either carries the C at that site, or it does not. If a C is present, the patient is
identified as having the "unhealthy" allele. If no C is present, then the "healthy" form of the
allele is identified. Patient reports may be prepared on this basis.

Example 7 

Health care workers currently seek to distinguish among Chlamydia trachomatis
strains to determine the molecular epidemiologic association of a range of diseases with
infecting genotype (see Dean, D. et al. "Major Outer Membrane Protein Variants of
Chlamydia trachomatis Are Associated with Severe Upper Genital Tract Infections and
Histopathology in San Francisco." J. Infect. Dis. 172:1013-22 (1995)). According to the
instant invention, the presence and genotype of pure and mixed cultures of C. trachomatis
may be determined by examining the C. trachomatis omp1 gene (Outer Membrane Protein 1).

The omp1 gene has at least 4 variable sequence ("VS") domains that may be used to
distinguish among the 15 known genotypes (Yuan, Y. et al. "Nucleotide and Deduced
Amino Acid Sequences for the Four Variable Domains of the Major Outer Membrane
Proteins of the 15 Chlamydia trachomatis Serovars" Infect. Immun. 57: 1040-1049
(1989)). Logically, to determine presence of a genotype in detectable amounts in a possibly
mixed culture, the technique must search for a nucleotide which is unique among the
genotypes at a specific location. For example, genotype H has a unique A at site 284. No
other genotype shares this A; therefore it is diagnostic of genotype H. Other genotypes
have other unique nucleotides. On this basis, a preferred order of single nucleotide
sequencing may be determined, as follows.
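
The search for a nucleotide that is unique to one genotype at a given site can be sketched as follows. The sketch assumes the genotype sequences are already aligned and of equal length; the three short sequences are invented for illustration and are not omp1 data.

```python
from collections import Counter

def diagnostic_sites(sequences):
    """Map each genotype to the (position, base) pairs found in no other genotype."""
    names = list(sequences)
    length = len(sequences[names[0]])
    result = {name: [] for name in names}
    for pos in range(length):
        column = Counter(sequences[name][pos] for name in names)
        for name in names:
            base = sequences[name][pos]
            if column[base] == 1:          # base occurs in this genotype only
                result[name].append((pos, base))
    return result

# Hypothetical aligned fragments for three genotypes.
aligned = {"D": "ACGT", "E": "ACGA", "H": "AAGT"}
print(diagnostic_sites(aligned))
# {'D': [], 'E': [(3, 'A')], 'H': [(1, 'A')]}
```

In this toy data an A reaction would reveal both E and H, so it would be a reasonable first reaction; D has no unique nucleotide and would have to be inferred from absences.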

Patient samples were obtained and DNA was extracted using standard SDS/Proteinase
K methods. The sample was alternatively prepared according to Dean, D. et al.
"Comparison of the major outer membrane protein sequence variant regions of B/Ba
isolates, a molecular epidemiologic approach to Chlamydia trachomatis infections." J.
Infect. Dis. 166: 383-392 (1992). In brief, the sample was washed once with 1X PBS,
centrifuged at 14,000 g, resuspended in dithiothreitol and Tris-EDTA buffer, and boiled
before PCR. One microliter of the sample was used in a 100 microliter reaction volume
that contained 50 mM KCl, 10 mM Tris-Cl (pH 8.1), 1.5 mM MgCl2, 100 micromolar
(each) dATP, dCTP, dGTP, and dTTP, 2.5 U of ampli-Taq DNA polymerase (Perkin-Elmer
Cetus, Foster City, CA), and 150 ng of each primer. The upstream primer was F11:

5' - ACCACTTGGTGTGACGCTATCAG - 3' [Seq ID No. 28]
(base pair [bp] position 154-176),

and the downstream primer was B11:

5' - CGGAATTGTGCATTTACGTGAG - 3' [Seq ID No. 29]
(bp position 1187-1166).



The thermocycler temperature profile was 95 degrees C for 45 sec, 55 degrees C for 1
min, and 72 degrees C for 2 min, with a final extension of 10 min at 72 degrees C after the
last cycle. One microliter of the PCR product was then used in each of two separate nested
100 microliter reactions with primer pair:
MF21
5' - CCGACCGCGTCTTGAAAACAGATGT - 3' [Seq ID No. 30], and
MB22
5' - CACCCACATTCCCAGAGAGCT - 3' [Seq ID No. 31]

which flank VS1 (Variable Sequence 1) and VS2, and primer pair
MVF3
5' - CGTGCAGCTTTGTGGGAATGT - 3' [Seq ID No. 32], and
MB4
5' - CTAGATTTCATCTTGTTCAATTGC - 3' [Seq ID No. 33]

which flank VS3 and VS4 (see Dean D, and Stephens RS. "Identification of individual
genotypes of Chlamydia trachomatis in experimentally mixed infections and mixed
infections among trachoma patients." J. Clin. Microbiol. 32:1506-10 (1994)). These
primer sets uniformly amplify prototype C. trachomatis serovars A-K and L1-3, including
Ba, Da, Ia, and L2a. A sample of each product (10 microliters) was run on a 1.5% agarose
gel to confirm the size of the amplification product. All PCR products were purified
(GeneClean II, Bio 101, La Jolla, CA) according to the manufacturer's instructions.

All samples that were positive for presence of C. trachomatis by PCR were subjected
to omp1 genotyping by single nucleotide sequencing. Amplification for sequencing
reactions was performed as above using at least one of the above noted amplification
primer pairs, with a 5' biotinylated version of either one of the primers.

The biotinylated strand was separated with Dynal beads and selected termination 
reactions were performed as in Example 1 using a 5' fluorescent labeled version of MF21 or 
MVF3. 

The selection of termination reactions depends on the degree of resolution among
genotypes desired. Only 1-3% of clinical C. trachomatis samples contain mixed genotypes.
Other pathogens, however, such as HIV, HPV and Hepatitis C, are more commonly mixed.
For all these organisms, it is important to have a method of distinguishing heterogeneous
samples.

The first 25 nt of the T termination reaction for C. trachomatis VS1 can be used to
distinguish among 3 groups of genotypes, as illustrated in Fig. 8A. The observed results
for Sample 1 in Fig. 8A demonstrate that detectable levels of at least one of the Group 1 and
at least one of the Group 3 genotypes are present. Group 2 is not detected.

If a higher degree of resolution is required, then further reactions are necessary. To
distinguish among possible Group 1 genotypes, the VS1 A reaction is performed. Fig. 8B
illustrates possible A results. The observed results of Sample 1 show an A at site 257. This
A could be provided by only E, F or G genotypes. Since the T track has already established
the absence of both F and G, E must be among the genotypes present. Further, the
absence of an A at 283 indicates that neither D nor F nor G is present. The presence of E
and the absence of D, F and G may be reported.
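
The presence and absence reasoning applied to Sample 1 can be sketched as set logic. The table below keeps only the two A sites mentioned in the text (257 and 283) and the four genotypes named there, so it is a simplified, partly hypothetical stand-in for the data of Fig. 8B. A genotype is excluded if it would produce a peak that is not observed, and confirmed if it is the only remaining genotype that explains some observed peak.

```python
def interpret(sites_by_genotype, observed_sites):
    """Return (possible, excluded, confirmed) genotype sets for a mixed sample."""
    observed = set(observed_sites)
    possible = {g for g, s in sites_by_genotype.items() if set(s) <= observed}
    excluded = set(sites_by_genotype) - possible
    confirmed = {
        g for g in possible
        if any(
            all(site not in sites_by_genotype[o] for o in possible if o != g)
            for site in sites_by_genotype[g]
        )
    }
    return possible, excluded, confirmed

# A-reaction sites taken from the text: 257 (E, F, G) and 283 (D, F, G).
a_sites = {"D": {283}, "E": {257}, "F": {257, 283}, "G": {257, 283}}

possible, excluded, confirmed = interpret(a_sites, observed_sites={257})
print(sorted(excluded))   # ['D', 'F', 'G']
print(sorted(confirmed))  # ['E']
```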

Other Group 1 genotypes may be present in addition to E; they do not appear because
their presence is effectively masked by E. Other single nucleotide termination reactions can
be performed to distinguish among these other possible contributors, if necessary. The
investigator simply determines which single nucleotide reaction will effectively distinguish
among the genotypes which may be present and need to be distinguished.

Alternatively, Sample 2, which showed the presence of Group 1 only in the T reaction,
is shown to consist of only the Ba genotype because of an absence of A at 268. This
shows that both the presence and absence of nucleotides can be used to determine the
presence of some genotypes in some circumstances.

The first 25 nt of C and G termination reactions for VS1 only are included in Fig. 8C 
to show how an investigator can determine which reaction to select and perform. If higher 
degrees of resolution are required, the termination reactions for VS2, VS3 and VS4 may be 
performed. 

Not only the genotype, but also variants of D, E, F, H, I and K genotypes (as disclosed
in Dean, D. et al. "Major Outer Membrane Protein Variants of Chlamydia trachomatis Are
Associated with Severe Upper Genital Tract Infections and Histopathology in San
Francisco." J. Infect. Dis. 172:1013-22 (1995)) may be distinguished by using the above
single nucleotide sequencing method.

EXAMPLE 8 

The allelic frequencies (%) of HLA Class I C are distributed among Canadians as
follows:

Cw1              5.5
Cw2              4.4
Cw4             10.0
Cw5              6.4
Cw6              9.4
Cw7             28.9
Cw9              7.2
Cw10             5.7
Cw11             0.5
Unknown/other   22.0

On the basis of this data, for a Canadian sample, it is preferable to first perform termination
reactions that preferentially distinguish homozygotes and heterozygotes containing a Cw7
allele (i.e. Cw*0701 to Cw*0704). This should be followed by Cw4, Cw6 and Cw9,
etc. Cw7 is preferentially distinguished on the basis of C/G analysis (122 out of 134
possible combinations; see Appendix 2), which also distinguishes a further 320 out of the
remaining 496 combinations. Cw4 is also preferentially distinguished on the basis of C/G
analysis (57 out of 69), with a further 385 out of the remaining 561. Thus the preferred
order of termination reactions is as follows:

1. Determine sense strand C nucleotide sequence for patient sample exon 2 and exon 3;
then

2. Determine sense strand G nucleotide sequence for patient sample exon 2 and exon 3;
then

3. Combine G and C sequencing results by computer analysis to identify 442 out of 630
possible combinations, including 179/195 possible allelic pairs containing at least one Cw7
or Cw4 allele (38.9% of Canadian population); then

4. Determine sense strand A nucleotide sequence for exons 2 and 3; then

5. Combine A, C and G sequencing results by computer analysis. Identifies remaining,
undetermined heterozygotes.

The only combinations that cannot be distinguished after this point include the 2
remaining alleles, Cw*12022 and Cw*12021, which cannot be distinguished by nucleotide
sequencing of only exons 2 and 3. Further reactions according to the invention may be
performed to distinguish among these alleles. Note that since these alleles differ only at a
silent mutation, they are identical at the amino acid level, and do not need to be
distinguished in practice. Sample reports can simply confirm the presence of the one allele
plus either of Cw*12022 or Cw*12021.

If the patient sample is identified at any one step, then the following step(s) need not 
be performed for that sample. 
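
The combination counts used in this example (134, 69, 195 and 630) can be checked by counting unordered allele pairs. The sketch assumes 35 known alleles, of which 4 are Cw7 alleles (Cw*0701 to Cw*0704) and 2 are Cw4 alleles (Cw*0401 and Cw*0402), as listed in Appendix I; the function names are hypothetical.

```python
def pairs(n):
    """Unordered allele pairs from n alleles, homozygotes included."""
    return n * (n + 1) // 2

def pairs_containing(n_total, n_group):
    """Pairs that include at least one allele from a group of n_group alleles."""
    return pairs(n_total) - pairs(n_total - n_group)

print(pairs(35))                # 630 possible combinations
print(pairs_containing(35, 4))  # 134 combinations containing a Cw7 allele
print(pairs_containing(35, 2))  # 69 combinations containing a Cw4 allele
print(pairs_containing(35, 6))  # 195 combinations containing Cw7 or Cw4
```

The remainders quoted in the text follow directly: 630 - 134 = 496 and 630 - 69 = 561.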

EXAMPLE 9 

Analysis of the HLA-DRB1 allelic type of a sample may be performed according to
Example 1 using two chain terminating nucleotides. 100 ng of patient sample DNA
(previously amplified as in Example 1) is combined with labeled sequencing primer:
5' - GAGTGTCATTTCTTCAA - 3' [SEQ ID NO. 18]

(30 ng (5 pM total)); in 2X sequencing buffer (52 mM Tris-HCl, pH 9.5, 13 mM MgCl2);
and 2 U of Thermo Sequenase enzyme (Amersham Life Sciences, Cleveland) in a final
volume of 3 ul. This sequencing pre-mix is kept on ice until ready to use, and then
combined with 3 ul of one of the following termination mixtures:

A/C termination reaction: 
750 uM each of dATP, dCTP, dGTP, and dTTP; 2.5 uM ddATP; 2.5 uM ddCTP 

A/G termination reaction: 
750 uM each of dATP, dCTP, dGTP, and dTTP; 2.5 uM ddGTP; 2.5 uM ddATP 

Total termination reaction volume: 6 ul 

The termination reaction mixture is thermal cycled in a Robocycler for 30 cycles (or fewer 
if found to be satisfactory): 
95 C 40 sec 
50 C 30 sec 
68 C 60 sec 

After cycling, 12 ul of loading buffer consisting of 100% formamide with 5 
mg/ml dextran blue is added to the termination reaction mixture, and an appropriate volume 
(e.g. 15 ul) is loaded onto an automated DNA sequencing apparatus, such as a Visible 
Genetics OPEN GENE™ System. 
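
As a quick arithmetic check on the recipe in this example, the sketch below computes the final nucleotide concentrations once 3 ul of pre-mix is combined with 3 ul of termination mixture, together with the programmed hold time for 30 cycles. It assumes simple volumetric dilution and ignores ramp times between temperatures; the helper name `final_conc` is hypothetical.

```python
def final_conc(stock_conc: float, stock_vol: float, total_vol: float) -> float:
    """Concentration after diluting stock_vol of stock into total_vol."""
    return stock_conc * stock_vol / total_vol


PREMIX_UL = 3.0
TERM_MIX_UL = 3.0
TOTAL_UL = PREMIX_UL + TERM_MIX_UL               # 6 ul termination reaction

dntp_um = final_conc(750.0, TERM_MIX_UL, TOTAL_UL)   # each dNTP, in uM
ddntp_um = final_conc(2.5, TERM_MIX_UL, TOTAL_UL)    # each ddNTP, in uM
ratio = dntp_um / ddntp_um                           # dNTP : ddNTP

CYCLES = 30
STEP_SECONDS = (40, 30, 60)                      # 95 C, 50 C, 68 C holds
hold_minutes = CYCLES * sum(STEP_SECONDS) / 60

LOADING_BUFFER_UL = 12.0
volume_after_buffer = TOTAL_UL + LOADING_BUFFER_UL

print(dntp_um, ddntp_um, ratio)      # 375.0 1.25 300.0
print(hold_minutes)                  # 65.0
print(volume_after_buffer)           # 18.0
```

The dNTP to ddNTP ratio is unchanged by dilution, so both termination mixtures run at 300:1 for each terminated base.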

Appendix I 

HLA Class I C locus: allele analysis on the basis of exons 2 and 3. 

Sequences obtained from the Strasbourg Data Base 

Internet Address - ftp://FTP.EMBL-Heidelberg.DE/pub/databases 

35 known alleles for HLA Class I C locus. 




 1: Cw*0101.hla        18: Cw*0801.hla 
 2: Cw*0102.hla        19: Cw*0802.hla 
 3: Cw*0201.hla        20: Cw*0803.hla 
 4: Cw*02021.hla       21: Cw*1201.hla 
 5: Cw*02022.hla       22: Cw*12021.hla 
 6: Cw*0301.hla        23: Cw*12022.hla 
 7: Cw*0302.hla        24: Cw*1203.hla 
 8: Cw*0303.hla        25: Cw*1301.hla 
 9: Cw*0304.hla        26: Cw*1402.hla 
10: Cw*0401.hla        27: Cw*1403.hla 
11: Cw*0402.hla        28: Cw*1501.hla 
12: Cw*0501.hla        29: Cw*1502.hla 
13: Cw*0602.hla        30: Cw*1503.hla 
14: Cw*0702.hla        31: Cw*1505.hla 
15: Cw*0701.hla        32: Cw*1504.hla 
16: Cw*0703.hla        33: Cw*1601.hla 
17: Cw*0704.hla        34: Cw*1602.hla 
                       35: Cw*1701.hla 



35 alleles may be combined as 35 homozygous pairs or 595 heterozygous pairs (630 possible 
combinations in total). Homozygous pairs may be distinguished by single nucleotide 
sequencing in the following order: 

Non-Unique Sequences using A: 

Cw*0102.hla = (Cw*0102.hla) 
Cw*0101.hla = (Cw*0101.hla) 
Cw*0701.hla = (Cw*0701.hla) 
Cw*0702.hla = (Cw*0702.hla) 
Cw*12022.hla = (Cw*12022.hla, Cw*1203.hla) 
Cw*12021.hla = (Cw*12021.hla, Cw*1203.hla) 
Cw*12021.hla = (Cw*12021.hla, Cw*12022.hla) 
Cw*1503.hla = (Cw*1503.hla) 
Cw*1502.hla = (Cw*1502.hla) 
Cw*1504.hla = (Cw*1504.hla) 
Cw*1505.hla = (Cw*1505.hla) 



Unique Sequences using A: 

1: Cw*0201.hla    2: Cw*02021.hla    3: Cw*02022.hla 
4: Cw*0301.hla    5: Cw*0302.hla    6: Cw*0303.hla 
7: Cw*0304.hla    8: Cw*0401.hla    9: Cw*0402.hla 
10: Cw*0501.hla    11: Cw*0602.hla    12: Cw*0703.hla 
13: Cw*0704.hla    14: Cw*0801.hla    15: Cw*0802.hla 
16: Cw*0803.hla    17: Cw*1201.hla    18: Cw*1301.hla 
19: Cw*1402.hla    20: Cw*1403.hla    21: Cw*1501.hla 
22: Cw*1601.hla    23: Cw*1602.hla    24: Cw*1701.hla 



Non-Unique Sequences using C: 

Cw*02022.hla = (Cw*02022.hla) 
Cw*02021.hla = (Cw*02021.hla) 
Cw*0304.hla = (Cw*0304.hla) 
Cw*0303.hla = (Cw*0303.hla) 
Cw*0802.hla = (Cw*0802.hla) 
Cw*0803.hla = (Cw*0803.hla) 
Cw*0501.hla = (Cw*0501.hla) 
Cw*0801.hla = (Cw*0801.hla) 
Cw*12022.hla = (Cw*12022.hla) 
Cw*12021.hla = (Cw*12021.hla) 
Cw*1504.hla = (Cw*1504.hla) 
Cw*1403.hla = (Cw*1403.hla) 
Cw*1402.hla = (Cw*1402.hla) 
Cw*1503.hla = (Cw*1503.hla, Cw*1505.hla) 
Cw*1502.hla = (Cw*1502.hla, Cw*1505.hla) 
Cw*1502.hla = (Cw*1502.hla, Cw*1503.hla) 
Cw*1203.hla = (Cw*1203.hla) 
Cw*1602.hla = (Cw*1602.hla) 
Cw*1601.hla = (Cw*1601.hla) 



WO 97/23650 



PCT/US96/20202 






Unique Sequences using C: 

1: Cw*0101.hla    2: Cw*0102.hla    3: Cw*0201.hla 
4: Cw*0301.hla    5: Cw*0302.hla    6: Cw*0401.hla 
7: Cw*0402.hla    8: Cw*0602.hla    9: Cw*0702.hla 
10: Cw*0701.hla    11: Cw*0703.hla    12: Cw*0704.hla 
13: Cw*1201.hla    14: Cw*1301.hla    15: Cw*1501.hla 
16: Cw*1701.hla 



Non-Unique Sequences using G: 

Cw*02022.hla = (Cw*02022.hla) 
Cw*02021.hla = (Cw*02021.hla) 
Cw*0303.hla = (Cw*0303.hla, Cw*0304.hla, Cw*0801.hla, Cw*0803.hla, Cw*1601.hla, Cw*1602.hla) 
Cw*0302.hla = (Cw*0302.hla, Cw*0304.hla, Cw*0801.hla, Cw*0803.hla, Cw*1601.hla, Cw*1602.hla) 
Cw*0302.hla = (Cw*0302.hla, Cw*0303.hla, Cw*0801.hla, Cw*0803.hla, Cw*1601.hla, Cw*1602.hla) 
Cw*12021.hla = (Cw*12021.hla, Cw*12022.hla, Cw*1203.hla, Cw*1301.hla) 
Cw*0302.hla = (Cw*0302.hla, Cw*0303.hla, Cw*0304.hla, Cw*0803.hla, Cw*1601.hla, Cw*1602.hla) 
Cw*1402.hla = (Cw*1402.hla, Cw*1403.hla) 
Cw*0302.hla = (Cw*0302.hla, Cw*0303.hla, Cw*0304.hla, Cw*0801.hla, Cw*1601.hla, Cw*1602.hla) 
Cw*0401.hla = (Cw*0401.hla, Cw*12022.hla, Cw*1203.hla, Cw*1301.hla) 
Cw*0401.hla = (Cw*0401.hla, Cw*12021.hla, Cw*1203.hla, Cw*1301.hla) 
Cw*0401.hla = (Cw*0401.hla, Cw*12021.hla, Cw*12022.hla, Cw*1301.hla) 
Cw*0401.hla = (Cw*0401.hla, Cw*12021.hla, Cw*12022.hla, Cw*1203.hla) 
Cw*0802.hla = (Cw*0802.hla, Cw*1403.hla) 
Cw*0802.hla = (Cw*0802.hla, Cw*1402.hla) 
Cw*1502.hla = (Cw*1502.hla, Cw*1505.hla, Cw*1504.hla) 
Cw*1501.hla = (Cw*1501.hla, Cw*1505.hla, Cw*1504.hla) 
Cw*1501.hla = (Cw*1501.hla, Cw*1502.hla, Cw*1504.hla) 
Cw*1501.hla = (Cw*1501.hla, Cw*1502.hla, Cw*1505.hla) 
Cw*0302.hla = (Cw*0302.hla, Cw*0303.hla, Cw*0304.hla, Cw*0801.hla, Cw*0803.hla, Cw*1602.hla) 
Cw*0302.hla = (Cw*0302.hla, Cw*0303.hla, Cw*0304.hla, Cw*0801.hla, Cw*0803.hla, Cw*1601.hla) 



Unique Sequences using G: 

1: Cw*0101.hla    2: Cw*0102.hla    3: Cw*0201.hla 
4: Cw*0301.hla    5: Cw*0402.hla    6: Cw*0501.hla 
7: Cw*0602.hla    8: Cw*0702.hla    9: Cw*0701.hla 
10: Cw*0703.hla    11: Cw*0704.hla    12: Cw*1201.hla 
13: Cw*1503.hla    14: Cw*1701.hla 



Non-Unique Sequences using T: 

Cw*0102.hla = (Cw*0102.hla) 
Cw*0101.hla = (Cw*0101.hla) 
Cw*02021.hla = (Cw*02021.hla, Cw*02022.hla) 
Cw*0201.hla = (Cw*0201.hla, Cw*02022.hla) 
Cw*0201.hla = (Cw*0201.hla, Cw*02021.hla) 
Cw*0303.hla = (Cw*0303.hla, Cw*0304.hla) 
Cw*0302.hla = (Cw*0302.hla, Cw*0304.hla) 
Cw*0302.hla = (Cw*0302.hla, Cw*0303.hla) 
Cw*0402.hla = (Cw*0402.hla) 
Cw*0401.hla = (Cw*0401.hla) 
Cw*0801.hla = (Cw*0801.hla, Cw*0802.hla, Cw*0803.hla) 
Cw*0701.hla = (Cw*0701.hla) 
Cw*0702.hla = (Cw*0702.hla) 
Cw*0501.hla = (Cw*0501.hla, Cw*0802.hla, Cw*0803.hla) 
Cw*0501.hla = (Cw*0501.hla, Cw*0801.hla, Cw*0803.hla) 
Cw*0501.hla = (Cw*0501.hla, Cw*0801.hla, Cw*0802.hla) 
Cw*12022.hla = (Cw*12022.hla, Cw*1301.hla) 
Cw*12021.hla = (Cw*12021.hla, Cw*1301.hla) 
Cw*12021.hla = (Cw*12021.hla, Cw*12022.hla) 
Cw*1403.hla = (Cw*1403.hla) 
Cw*1402.hla = (Cw*1402.hla) 
Cw*1503.hla = (Cw*1503.hla, Cw*1505.hla) 
Cw*1502.hla = (Cw*1502.hla, Cw*1505.hla) 
Cw*1502.hla = (Cw*1502.hla, Cw*1503.hla) 
Cw*1602.hla = (Cw*1602.hla) 
Cw*1601.hla = (Cw*1601.hla) 

Unique Sequences using T: 

1: Cw*0301.hla    2: Cw*0602.hla    3: Cw*0703.hla 
4: Cw*0704.hla    5: Cw*1201.hla    6: Cw*1203.hla 
7: Cw*1501.hla    8: Cw*1504.hla    9: Cw*1701.hla 



Non-Unique Sequences using AC: 

Cw*12022.hla = (Cw*12022.hla) 
Cw*12021.hla = (Cw*12021.hla) 
Cw*1503.hla = (Cw*1503.hla) 
Cw*1502.hla = (Cw*1502.hla) 

Unique Sequences using AC: 

1: Cw*0101.hla    2: Cw*0102.hla    3: Cw*0201.hla 
4: Cw*02021.hla    5: Cw*02022.hla    6: Cw*0301.hla 
7: Cw*0302.hla    8: Cw*0303.hla    9: Cw*0304.hla 
10: Cw*0401.hla    11: Cw*0402.hla    12: Cw*0501.hla 
13: Cw*0602.hla    14: Cw*0702.hla    15: Cw*0701.hla 
16: Cw*0703.hla    17: Cw*0704.hla    18: Cw*0801.hla 
19: Cw*0802.hla    20: Cw*0803.hla    21: Cw*1201.hla 
22: Cw*1203.hla    23: Cw*1301.hla    24: Cw*1402.hla 
25: Cw*1403.hla    26: Cw*1501.hla    27: Cw*1505.hla 
28: Cw*1504.hla    29: Cw*1601.hla    30: Cw*1602.hla 
31: Cw*1701.hla 



Non-Unique Sequences using AG: 

Cw*12022.hla = (Cw*12022.hla, Cw*1203.hla) 
Cw*12021.hla = (Cw*12021.hla, Cw*1203.hla) 
Cw*12021.hla = (Cw*12021.hla, Cw*12022.hla) 
Cw*1504.hla = (Cw*1504.hla) 
Cw*1505.hla = (Cw*1505.hla) 

Unique Sequences using AG: 

1: Cw*0101.hla    2: Cw*0102.hla    3: Cw*0201.hla 
4: Cw*02021.hla    5: Cw*02022.hla    6: Cw*0301.hla 
7: Cw*0302.hla    8: Cw*0303.hla    9: Cw*0304.hla 
10: Cw*0401.hla    11: Cw*0402.hla    12: Cw*0501.hla 
13: Cw*0602.hla    14: Cw*0702.hla    15: Cw*0701.hla 
16: Cw*0703.hla    17: Cw*0704.hla    18: Cw*0801.hla 
19: Cw*0802.hla    20: Cw*0803.hla    21: Cw*1201.hla 
22: Cw*1301.hla    23: Cw*1402.hla    24: Cw*1403.hla 
25: Cw*1501.hla    26: Cw*1502.hla    27: Cw*1503.hla 
28: Cw*1601.hla    29: Cw*1602.hla    30: Cw*1701.hla 

Non-Unique Sequences using AT: 

Cw*0102.hla = (Cw*0102.hla) 
Cw*0101.hla = (Cw*0101.hla) 
Cw*0701.hla = (Cw*0701.hla) 
Cw*0702.hla = (Cw*0702.hla) 
Cw*12022.hla = (Cw*12022.hla) 
Cw*12021.hla = (Cw*12021.hla) 
Cw*1503.hla = (Cw*1503.hla) 
Cw*1502.hla = (Cw*1502.hla) 

Unique Sequences using AT: 

1: Cw*0201.hla    2: Cw*02021.hla    3: Cw*02022.hla 
4: Cw*0301.hla    5: Cw*0302.hla    6: Cw*0303.hla 
7: Cw*0304.hla    8: Cw*0401.hla    9: Cw*0402.hla 
10: Cw*0501.hla    11: Cw*0602.hla    12: Cw*0703.hla 
13: Cw*0704.hla    14: Cw*0801.hla    15: Cw*0802.hla 
16: Cw*0803.hla    17: Cw*1201.hla    18: Cw*1203.hla 
19: Cw*1301.hla    20: Cw*1402.hla    21: Cw*1403.hla 
22: Cw*1501.hla    23: Cw*1505.hla    24: Cw*1504.hla 
25: Cw*1601.hla    26: Cw*1602.hla    27: Cw*1701.hla 



Non-Unique Sequences using CG: 

Cw*02022.hla = (Cw*02022.hla) 
Cw*02021.hla = (Cw*02021.hla) 
Cw*0304.hla = (Cw*0304.hla) 
Cw*0303.hla = (Cw*0303.hla) 
Cw*0803.hla = (Cw*0803.hla) 
Cw*0801.hla = (Cw*0801.hla) 
Cw*12022.hla = (Cw*12022.hla) 
Cw*12021.hla = (Cw*12021.hla) 
Cw*1403.hla = (Cw*1403.hla) 
Cw*1402.hla = (Cw*1402.hla) 
Cw*1505.hla = (Cw*1505.hla) 
Cw*1502.hla = (Cw*1502.hla) 
Cw*1602.hla = (Cw*1602.hla) 
Cw*1601.hla = (Cw*1601.hla) 

Unique Sequences using CG: 

1: Cw*0101.hla    2: Cw*0102.hla    3: Cw*0201.hla 
4: Cw*0301.hla    5: Cw*0302.hla    6: Cw*0401.hla 
7: Cw*0402.hla    8: Cw*0501.hla    9: Cw*0602.hla 
10: Cw*0702.hla    11: Cw*0701.hla    12: Cw*0703.hla 
13: Cw*0704.hla    14: Cw*0802.hla    15: Cw*1201.hla 
16: Cw*1203.hla    17: Cw*1301.hla    18: Cw*1501.hla 
19: Cw*1503.hla    20: Cw*1504.hla    21: Cw*1701.hla 



Non-Unique Sequences using CT: 

Cw*02022.hla = (Cw*02022.hla) 
Cw*02021.hla = (Cw*02021.hla) 
Cw*0304.hla = (Cw*0304.hla) 
Cw*0303.hla = (Cw*0303.hla) 
Cw*0802.hla = (Cw*0802.hla) 
Cw*0803.hla = (Cw*0803.hla) 
Cw*0501.hla = (Cw*0501.hla) 
Cw*0801.hla = (Cw*0801.hla) 
Cw*12022.hla = (Cw*12022.hla) 
Cw*12021.hla = (Cw*12021.hla) 
Cw*1403.hla = (Cw*1403.hla) 
Cw*1402.hla = (Cw*1402.hla) 
Cw*1503.hla = (Cw*1503.hla, Cw*1505.hla) 
Cw*1502.hla = (Cw*1502.hla, Cw*1505.hla) 
Cw*1502.hla = (Cw*1502.hla, Cw*1503.hla) 
Cw*1602.hla = (Cw*1602.hla) 
Cw*1601.hla = (Cw*1601.hla) 

Unique Sequences using CT: 

1: Cw*0101.hla    2: Cw*0102.hla    3: Cw*0201.hla 
4: Cw*0301.hla    5: Cw*0302.hla    6: Cw*0401.hla 
7: Cw*0402.hla    8: Cw*0602.hla    9: Cw*0702.hla 
10: Cw*0701.hla    11: Cw*0703.hla    12: Cw*0704.hla 
13: Cw*1201.hla    14: Cw*1203.hla    15: Cw*1301.hla 
16: Cw*1501.hla    17: Cw*1504.hla    18: Cw*1701.hla 



Non-Unique Sequences using GT: 

Cw*02022.hla = (Cw*02022.hla) 
Cw*02021.hla = (Cw*02021.hla) 
Cw*0303.hla = (Cw*0303.hla, Cw*0304.hla) 
Cw*0302.hla = (Cw*0302.hla, Cw*0304.hla) 
Cw*0302.hla = (Cw*0302.hla, Cw*0303.hla) 
Cw*0803.hla = (Cw*0803.hla) 
Cw*0801.hla = (Cw*0801.hla) 
Cw*12022.hla = (Cw*12022.hla, Cw*1301.hla) 
Cw*12021.hla = (Cw*12021.hla, Cw*1301.hla) 
Cw*12021.hla = (Cw*12021.hla, Cw*12022.hla) 
Cw*1403.hla = (Cw*1403.hla) 
Cw*1402.hla = (Cw*1402.hla) 
Cw*1505.hla = (Cw*1505.hla) 
Cw*1502.hla = (Cw*1502.hla) 
Cw*1602.hla = (Cw*1602.hla) 
Cw*1601.hla = (Cw*1601.hla) 

Unique Sequences using GT: 

1: Cw*0101.hla    2: Cw*0102.hla    3: Cw*0201.hla 
4: Cw*0301.hla    5: Cw*0401.hla    6: Cw*0402.hla 
7: Cw*0501.hla    8: Cw*0602.hla    9: Cw*0702.hla 
10: Cw*0701.hla    11: Cw*0703.hla    12: Cw*0704.hla 
13: Cw*0802.hla    14: Cw*1201.hla    15: Cw*1203.hla 
16: Cw*1501.hla    17: Cw*1503.hla    18: Cw*1504.hla 
19: Cw*1701.hla 



Non-Unique Sequences using ACG: 

Cw*12022.hla = (Cw*12022.hla) 
Cw*12021.hla = (Cw*12021.hla) 

Unique Sequences using ACG: 

1: Cw*0101.hla    2: Cw*0102.hla    3: Cw*0201.hla 
4: Cw*02021.hla    5: Cw*02022.hla    6: Cw*0301.hla 
7: Cw*0302.hla    8: Cw*0303.hla    9: Cw*0304.hla 
10: Cw*0401.hla    11: Cw*0402.hla    12: Cw*0501.hla 
13: Cw*0602.hla    14: Cw*0702.hla    15: Cw*0701.hla 
16: Cw*0703.hla    17: Cw*0704.hla    18: Cw*0801.hla 
19: Cw*0802.hla    20: Cw*0803.hla    21: Cw*1201.hla 
22: Cw*1203.hla    23: Cw*1301.hla    24: Cw*1402.hla 
25: Cw*1403.hla    26: Cw*1501.hla    27: Cw*1502.hla 
28: Cw*1503.hla    29: Cw*1505.hla    30: Cw*1504.hla 
31: Cw*1601.hla    32: Cw*1602.hla    33: Cw*1701.hla 



Non-Unique Sequences using ACT: 

Cw*12022.hla = (Cw*12022.hla) 
Cw*12021.hla = (Cw*12021.hla) 
Cw*1503.hla = (Cw*1503.hla) 
Cw*1502.hla = (Cw*1502.hla) 

Unique Sequences using ACT: 

1: Cw*0101.hla    2: Cw*0102.hla    3: Cw*0201.hla 
4: Cw*02021.hla    5: Cw*02022.hla    6: Cw*0301.hla 
7: Cw*0302.hla    8: Cw*0303.hla    9: Cw*0304.hla 
10: Cw*0401.hla    11: Cw*0402.hla    12: Cw*0501.hla 
13: Cw*0602.hla    14: Cw*0702.hla    15: Cw*0701.hla 
16: Cw*0703.hla    17: Cw*0704.hla    18: Cw*0801.hla 
19: Cw*0802.hla    20: Cw*0803.hla    21: Cw*1201.hla 
22: Cw*1203.hla    23: Cw*1301.hla    24: Cw*1402.hla 
25: Cw*1403.hla    26: Cw*1501.hla    27: Cw*1505.hla 
28: Cw*1504.hla    29: Cw*1601.hla    30: Cw*1602.hla 
31: Cw*1701.hla 



Non-Unique Sequences using AGT: 

Cw*12022.hla = (Cw*12022.hla) 
Cw*12021.hla = (Cw*12021.hla) 

Unique Sequences using AGT: 

1: Cw*0101.hla    2: Cw*0102.hla    3: Cw*0201.hla 
4: Cw*02021.hla    5: Cw*02022.hla    6: Cw*0301.hla 
7: Cw*0302.hla    8: Cw*0303.hla    9: Cw*0304.hla 
10: Cw*0401.hla    11: Cw*0402.hla    12: Cw*0501.hla 
13: Cw*0602.hla    14: Cw*0702.hla    15: Cw*0701.hla 
16: Cw*0703.hla    17: Cw*0704.hla    18: Cw*0801.hla 
19: Cw*0802.hla    20: Cw*0803.hla    21: Cw*1201.hla 
22: Cw*1203.hla    23: Cw*1301.hla    24: Cw*1402.hla 
25: Cw*1403.hla    26: Cw*1501.hla    27: Cw*1502.hla 
28: Cw*1503.hla    29: Cw*1505.hla    30: Cw*1504.hla 
31: Cw*1601.hla    32: Cw*1602.hla    33: Cw*1701.hla 



Non-Unique Sequences using CGT: 

Cw*02022.hla = (Cw*02022.hla) 
Cw*02021.hla = (Cw*02021.hla) 
Cw*0304.hla = (Cw*0304.hla) 
Cw*0303.hla = (Cw*0303.hla) 
Cw*0803.hla = (Cw*0803.hla) 
Cw*0801.hla = (Cw*0801.hla) 
Cw*12022.hla = (Cw*12022.hla) 
Cw*12021.hla = (Cw*12021.hla) 
Cw*1403.hla = (Cw*1403.hla) 
Cw*1402.hla = (Cw*1402.hla) 
Cw*1505.hla = (Cw*1505.hla) 
Cw*1502.hla = (Cw*1502.hla) 
Cw*1602.hla = (Cw*1602.hla) 
Cw*1601.hla = (Cw*1601.hla) 

Unique Sequences using CGT: 

1: Cw*0101.hla    2: Cw*0102.hla    3: Cw*0201.hla 
4: Cw*0301.hla    5: Cw*0302.hla    6: Cw*0401.hla 
7: Cw*0402.hla    8: Cw*0501.hla    9: Cw*0602.hla 
10: Cw*0702.hla    11: Cw*0701.hla    12: Cw*0703.hla 
13: Cw*0704.hla    14: Cw*0802.hla    15: Cw*1201.hla 
16: Cw*1203.hla    17: Cw*1301.hla    18: Cw*1501.hla 
19: Cw*1503.hla    20: Cw*1504.hla    21: Cw*1701.hla 

Non-Unique Sequences using ACGT: 

Cw*12022.hla = (Cw*12021.hla) 
Cw*12021.hla = (Cw*12022.hla) 

Unique Sequences using ACGT: 

1: Cw*0101.hla    2: Cw*0102.hla    3: Cw*0201.hla 
4: Cw*02021.hla    5: Cw*02022.hla    6: Cw*0301.hla 
7: Cw*0302.hla    8: Cw*0303.hla    9: Cw*0304.hla 
10: Cw*0401.hla    11: Cw*0402.hla    12: Cw*0501.hla 
13: Cw*0602.hla    14: Cw*0702.hla    15: Cw*0701.hla 
16: Cw*0703.hla    17: Cw*0704.hla    18: Cw*0801.hla 
19: Cw*0802.hla    20: Cw*0803.hla    21: Cw*1201.hla 
22: Cw*1203.hla    23: Cw*1301.hla    24: Cw*1402.hla 
25: Cw*1403.hla    26: Cw*1501.hla    27: Cw*1502.hla 
28: Cw*1503.hla    29: Cw*1505.hla    30: Cw*1504.hla 
31: Cw*1601.hla    32: Cw*1602.hla    33: Cw*1701.hla 
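
The unique and non-unique groupings tabulated in this appendix can be thought of as the result of hiding every base outside the chosen set and then grouping alleles whose remaining patterns coincide. The sketch below illustrates that idea on made-up toy sequences. The allele names, the sequences, and the function names `mask` and `partition` are hypothetical and are not taken from the database used to build the appendix; the sketch also assumes pre-aligned sequences of equal length and homozygous samples.

```python
from collections import defaultdict


def mask(seq, bases):
    """Keep only the bases that were sequenced; hide the rest as '-'."""
    return "".join(b if b in bases else "-" for b in seq)


def partition(alleles, bases):
    """Split allele names into unique ones and groups that look identical."""
    groups = defaultdict(list)
    for name, seq in alleles.items():
        groups[mask(seq, bases)].append(name)
    unique = [g[0] for g in groups.values() if len(g) == 1]
    non_unique = [g for g in groups.values() if len(g) > 1]
    return unique, non_unique


# Toy data only: three invented alleles of six bases each.
toy = {
    "allele_1": "ACGTAC",
    "allele_2": "ACGTTC",  # differs from allele_1 at position 5 (A vs T)
    "allele_3": "GCGTAC",  # differs from allele_1 at position 1 (A vs G)
}

for bases in ("A", "C", "G", "T"):
    print(bases, partition(toy, bases))
```

In this toy case the A reaction alone separates all three alleles, whereas the C reaction separates none, which mirrors why the order of termination reactions matters.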

SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(i) APPLICANT: Stevens, John K. 

Dunn, James M. 
Leushner, James 
Green, Ronald 

(ii) TITLE OF INVENTION: Method for Evaluation of 
Polymorphic Genetics Sequences, and Use Thereof in 
Identification of HLA Types 

(iii) NUMBER OF SEQUENCES: 33 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Oppedahl & Larson 

(B) STREET: 1992 Commerce Street Suite 309 

(C) CITY: Yorktown 

(D) STATE: NY 

(E) COUNTRY: US 

(F) ZIP: 10598 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette - 3.5 inch, 1.44 Mb storage 

(B) COMPUTER: IBM compatible 

(C) OPERATING SYSTEM: MS DOS 

(D) SOFTWARE: Word Perfect 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Larson, Marina T. 

(B) REGISTRATION NUMBER: 32,038 

(C) REFERENCE/DOCKET NUMBER: VGEN. P-019-WO 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (914) 245-3252 

(B) TELEFAX: (914) 962-4330 

(C) TELEX: 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for DR1 
alleles of HLA Class II genes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
TTGTGGCAGC TTAAGTTTGA AT 22 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for DR1 
alleles of HLA Class II genes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
CCGCCTCTGC TCCAGGAG 18 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for DR1 
alleles of HLA Class II genes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
CCCGCTCGTC TTCCAGGAT 19 



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for DR2 
alleles of HLA Class II genes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
TCCTGTGGCA GCCTAAGAG 19 

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for DR2 
alleles of HLA Class II genes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
CCGCGCCTGC TCCAGGAT 18 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for DR2 
alleles of HLA Class II genes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
AGGTGTCCAC CGCGCGGCG 19 



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for DR3, 8, 
11, 12, 13, 14 alleles of HLA Class II genes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
CACGTTTCTT GGAGTACTCT AC 22 

(2) INFORMATION FOR SEQ ID NO: 8: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 20 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for DR3, 8, 
11, 12, 13, 14 alleles of HLA Class II genes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
CCGCTGCACT GTGAAGCTCT 20 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for DR4 
alleles of HLA Class II genes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
GTTTCTTGGA GCAGGTTAAA CA 22 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for DR4 
alleles of HLA Class II genes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
CTGCACTGTG AAGCTCTCAC 20 



(2) INFORMATION FOR SEQ ID NO: 11: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for DR4 
alleles of HLA Class II genes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
CTGCACTGTG AAGCTCTCCA 20 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for DR7 
alleles of HLA Class II genes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
CCTGTGGCAG GGTAAGTATA 20 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for DR7 
alleles of HLA Class II genes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
CCCGTAGTTG TGTCTGCACA C 21 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for DR9 
alleles of HLA Class II genes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
GTTTCTTGAA GCAGGATAAG TTT 23 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for DR9 
alleles of HLA Class II genes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
CCCGTAGTTG TGTCTGCACA C 21 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for DR10 
alleles of HLA Class II genes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
CGGTTGCTGG AAAGACGCG 19 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for DR10 
alleles of HLA Class II genes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
CTGCACTGTG AAGCTCTCAC 20 



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: sequencing primer for DR alleles 
of HLA Class II genes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
GAGTGTCATT TCTTCAA 17 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for HLA-C 
gene, exon 2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
AGCGAGTGCC CGCCCGGCGA 20 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for HLA-C 
gene, exon 2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
ACCTGGCCCG TCCGTGGGGG ATGAG 25 



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for HLA-C 
gene, exon 3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
GACCGCGGGG CCGGGGCCAG GG 22 



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for HLA-C 
gene, exon 3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
GGAGATGGGG AAGGCTCCCC ACT 23 



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: forward sequencing primer for 
HLA-C gene, exon 3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 
CCGGGGCGCA GGTCACGA 18 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: forward sequencing primer for 
HLA-C gene, exon 3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 
GGAGGGTCGG GCGGGTCT 18 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: reverse sequencing primer for 
HLA-C gene, exon 3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25: 
CGGGACGTCG CAGAGGAA 18 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for exon 6 
of lipoprotein lipase gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 
GCCGAGATAC AATCTTGGTG 20 



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(D) OTHER INFORMATION: amplification primer for exon 6 
of lipoprotein lipase gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27: 
CAGGTACATT TTGCTGCTTC 20 



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Chlamydia 

(D) OTHER INFORMATION: amplification primer for 
Chlamydia omp1 gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28: 
ACCACTTGGT GTGACGCTAT CAG 23 

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Chlamydia 

(D) OTHER INFORMATION: amplification primer for 
Chlamydia omp1 gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
CGGAATTGTG CATTTACGTG AG 22 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Chlamydia 

(D) OTHER INFORMATION: amplification primer for 
Chlamydia omp1 gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 
CCGACCGCGT CTTGAAAACA GATGT 25 



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Chlamydia 

(D) OTHER INFORMATION: amplification primer for 
Chlamydia omp1 gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
CACCCACATT CCCAGAGAGC T 21 



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Chlamydia 

(D) OTHER INFORMATION: amplification primer for 
Chlamydia omp1 gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
CGTGCAGCTT TGTGGGAATG T 21 

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Chlamydia 

(D) OTHER INFORMATION: amplification primer for 
Chlamydia omp1 gene 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
CTAGATTTCA TCTTGTTCAA TTGC 24 

CLAIMS 

1. A method for identification of allelic type of a known polymorphic 
genetic locus in a sample comprising the steps of: 

(a) combining the sample with a sequencing reaction mixture containing 
a template-dependent nucleic acid polymerase, A, T, G and C nucleotide feedstocks, 
one type of chain terminating nucleotide and a sequencing primer under conditions 
suitable for template-dependent primer extension to form a plurality of oligonucleotide 
fragments of differing lengths, the lengths of said fragments indicating the positions of 
the type of base corresponding to the chain terminating nucleotide in the extended 
primer; and 

(b) evaluating the length of the oligonucleotide fragments thereby 
determining the positions of the type of base corresponding to the chain 
terminating nucleotide in the extended primer, characterized in that the sample is 
concurrently combined with at most three sequencing reaction mixtures containing 
different types of chain terminating nucleotides. 
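
The relationship recited in claim 1 between fragment length and base position can be illustrated with a minimal Python sketch. It assumes an idealized reaction in which termination occurs at every occurrence of the chosen base and the fragment length is the primer length plus the position of that base in the extended region. The function names and the example sequence are hypothetical and are not taken from the specification.

```python
def fragment_lengths(extended, terminator, primer_len):
    """Expected fragment lengths for a single-terminator reaction.

    `extended` is the sequence synthesized downstream of the primer
    (5' to 3'); a fragment ends wherever `terminator` occurs.
    Idealized: every occurrence of the base yields one fragment.
    """
    terminator = terminator.upper()
    return [
        primer_len + pos
        for pos, base in enumerate(extended.upper(), start=1)
        if base == terminator
    ]


def base_positions(lengths, primer_len):
    """Recover base positions (1-based, past the primer) from fragment lengths."""
    return sorted(length - primer_len for length in lengths)


# Hypothetical 7-base extension read with an A terminator and a 20-base primer.
lengths = fragment_lengths("ACGTAAC", "A", 20)
positions = base_positions(lengths, 20)
```

In this sketch `lengths` holds the sizes that would be read from the denaturing gel, and `positions` holds the positions of the A bases that those sizes imply.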

2. The method of claim 1, wherein the sample is combined with a 
single sequencing reaction mixture containing at most two chain terminating 
nucleotides, and the lengths of the oligonucleotide fragments produced are evaluated 
prior to combining the sample with any further sequencing reaction mixture. 

3. The method of claim 1, wherein the sample is combined with a 
single sequencing reaction mixture containing only one chain terminating nucleotide, 
and the lengths of the oligonucleotide fragments produced are evaluated prior to 
combining the sample with any further sequencing reaction mixture. 

4. The method of any of claims 1 to 3, wherein the sample is amplified 
prior to combining it with the sequencing reaction mixture to enrich the amount of the 
polymorphic genetic locus. 

5. The method of claim 4, wherein the amplification is performed using 
polymerase chain reaction amplification. 

6. The method of any of claims 1 to 5, characterized in that the length 
of the oligonucleotide fragments is evaluated by electrophoretic separation on a 
denaturing gel. 

7. A kit for identification of allelic type of a polymorphic genetic locus 
in a sample comprising, in packaged combination, 

(a) a sequencing primer adapted to hybridize to genetic material in the 
sample near the polymorphic genetic locus; and 

(b) two or more chain terminating nucleotides, wherein a first of said 
chain terminating nucleotides is provided in an amount which is five or more times 
greater than the amount of any other chain terminating nucleotide. 

8. The kit of claim 7, wherein the first chain terminating nucleotide is 
dideoxyadenosine. 

9. The kit of claim 7, wherein the first chain terminating nucleotide is 
dideoxycytosine. 

10. The kit of claim 7, wherein the first chain terminating nucleotide is 
dideoxythymine. 

11. The kit of claim 7, wherein the first chain terminating nucleotide is 
dideoxyguanosine. 

12. A method for determining the allelic type of a polymorphic gene in a 
sample comprising the steps of: 

(a) combining a first aliquot of the sample with a first sequencing 
reaction mixture containing a template-dependent nucleic acid polymerase, A, T, G and 
C nucleotide feedstocks, a first type of chain terminating nucleotide and a sequencing 
primer under conditions suitable for template-dependent primer extension to form a 
plurality of oligonucleotide fragments of differing lengths, the lengths of said fragments 
indicating the positions of the type of base corresponding to the first type of chain 
terminating nucleotide in the extended primer; 

(b) evaluating the length of the oligonucleotide fragments to determine 
the positions of the type of base corresponding to the first type of chain terminating 
nucleotide in the extended primer; and 

(c) comparing the positions of the type of base corresponding to the 
first type of chain terminating nucleotide in the extended primer to the positions found 
in known alleles of the gene whereby the sample can either be assigned as being of a 
particular type or is assigned as ambiguous for further evaluation. 
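
The comparison in step (c) of claim 12 can be sketched as a lookup against a table of known alleles. The Python below is a simplified illustration only: the allele names and position sets are invented, and the sketch assumes a homozygous sample, so that the observed positions must match one allele exactly.

```python
def assign_type(observed_positions, known_alleles):
    """Compare observed positions of one base with those of known alleles.

    Returns (assignment, matches): the allele name when exactly one known
    allele has the same set of positions, otherwise "ambiguous" together
    with the alleles that remain possible (which may be none).
    """
    observed = frozenset(observed_positions)
    matches = sorted(
        name
        for name, positions in known_alleles.items()
        if frozenset(positions) == observed
    )
    if len(matches) == 1:
        return matches[0], matches
    return "ambiguous", matches


# Hypothetical table: positions of the A bases in three known alleles.
KNOWN_A_POSITIONS = {
    "allele_1": [1, 5, 6],
    "allele_2": [1, 5],
    "allele_3": [1, 5],
}

unique_result = assign_type([1, 5, 6], KNOWN_A_POSITIONS)
ambiguous_result = assign_type([5, 1], KNOWN_A_POSITIONS)
```

Here `unique_result` names a single allele, while `ambiguous_result` reports two alleles that a single A reaction cannot distinguish, which is the case that the further evaluation of claim 13 addresses.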

13. The method of claim 12, wherein the sample is ambiguous after 
comparing the positions of the type of base corresponding to the first type of chain 
terminating nucleotide in the extended primer to the positions found in known alleles 
of the gene, further comprising the steps of 

combining a second aliquot of the sample with a second sequencing 
reaction mixture containing a template-dependent nucleic acid polymerase, A, T, G and 
C nucleotide feedstocks, a second type of chain terminating nucleotide, different from 
said first type, and a sequencing primer under conditions suitable for template 
dependent primer extension to form a plurality of oligonucleotide fragments of 
differing lengths, the lengths of said fragments indicating the positions of the type of 
base corresponding to the second type of chain terminating nucleotide in the extended 
primer; 

evaluating the length of the oligonucleotide fragments to determine the 
positions of the type of base corresponding to the second type of chain terminating 
nucleotide in the extended primer; and 

comparing the positions of the type of base corresponding to the first and 
second types of chain terminating nucleotide in the extended primer to the positions 
found in known alleles of the gene whereby the sample can either be assigned as being 
of a particular type or is assigned as ambiguous for further evaluation. 

14. The method of claim 13, wherein the sample is ambiguous after 
comparing the positions of the type of base corresponding to the first and second types 
of chain terminating nucleotide in the extended primer to the positions found in known 
alleles of the gene, further comprising the steps of 

combining a third aliquot of the sample with a third sequencing reaction 
mixture containing a template-dependent nucleic acid polymerase, A, T, G and C 
nucleotide feedstocks, a third type of chain terminating nucleotide, different from said 
first and second types, and a sequencing primer under conditions suitable for template 
dependent primer extension to form a plurality of oligonucleotide fragments of 
differing lengths, the lengths of said fragments indicating the positions of the type of 
base corresponding to the third type of chain terminating nucleotide in the extended 
primer; 

evaluating the length of the oligonucleotide fragments to determine the 
positions of the type of base corresponding to the third type of chain terminating 
nucleotide in the extended primer; and 

comparing the positions of the type of base corresponding to the first, 
second and third types of chain terminating nucleotide in the extended primer to the 
positions found in known alleles of the gene whereby the sample can either be assigned 
as being of a particular type or is assigned as ambiguous for further evaluation. 

15. The method of claim 14, wherein the sample is ambiguous after 
comparing the positions of the type of base corresponding to the first, second and third 
types of chain terminating nucleotide in the extended primer to the positions found in 
known alleles of the gene, further comprising the steps of 

combining a fourth aliquot of the sample with a fourth sequencing reaction 
mixture containing a template-dependent nucleic acid polymerase, A, T, G and C 
nucleotide feedstocks, a fourth type of chain terminating nucleotide, different from said 
first, second and third types, and a sequencing primer under conditions suitable for 
template-dependent primer extension to form a plurality of oligonucleotide fragments 
of differing lengths, the lengths of said fragments indicating the positions of the type of 
base corresponding to the fourth type of chain terminating nucleotide in the extended 
primer; 

evaluating the length of the oligonucleotide fragments to determine the 
positions of the type of base corresponding to the fourth type of chain terminating 
nucleotide in the extended primer; and 

comparing the positions of the type of base corresponding to the first, 
second, third and fourth types of chain terminating nucleotide in the extended primer to 
the positions found in known alleles of the gene whereby the sample can either be 
assigned as being of a particular type or is assigned as ambiguous for further 
evaluation. 
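
Claims 12 to 15 together describe a stepwise procedure in which a further termination reaction is run only while the sample remains ambiguous. The following Python sketch simulates that procedure under stated assumptions: the allele sequences are invented, the sample is treated as homozygous, each reaction is idealized as reporting every position of one base, and the reaction order (A, C, G, T) is arbitrary.

```python
def positions(seq, base):
    """1-based positions of `base` in `seq`, as one reaction would report them."""
    return frozenset(i for i, b in enumerate(seq, start=1) if b == base)


def hierarchical_type(sample_seq, alleles, order="ACGT"):
    """Run single-base reactions one at a time until the type is resolved.

    Returns (assignment, reactions_used). The candidate set shrinks after
    each reaction; the loop stops as soon as exactly one allele remains
    or no known allele is consistent with the observations.
    """
    candidates = dict(alleles)
    used = 0
    for base in order:
        used += 1
        observed = positions(sample_seq, base)
        candidates = {
            name: seq
            for name, seq in candidates.items()
            if positions(seq, base) == observed
        }
        if len(candidates) == 1:
            return next(iter(candidates)), used
        if not candidates:
            return "ambiguous", used
    return "ambiguous", used


# Hypothetical alleles of a short polymorphic region.
ALLELES = {"X1": "ACGTAC", "X2": "ACGTTC", "X3": "AGGTAC"}

one_reaction = hierarchical_type("ACGTTC", ALLELES)
two_reactions = hierarchical_type("ACGTAC", ALLELES)
```

In this sketch the first sample is resolved by the A reaction alone, while the second requires a C reaction as well, which mirrors the saving in reactions that the stepwise claims are directed to.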

16. The method of any of claims 12 to 15, wherein the sample is 
amplified prior to combining it with the sequencing reaction mixture to enrich the 
amount of the polymorphic genetic locus. 

17. The method of claim 16, wherein the amplification is performed 
using polymerase chain reaction amplification. 

18. The method of any of claims 12 to 17, wherein the gene is an HLA 
Class I gene. 

19. The method of any of claims 12 to 17, wherein the gene is an HLA 
Class II gene. 
POSSIBLE G TERMINATION REACTION RESULTS 
(positions 256 to 280 for types B, Ba, D, E, L1, L2, F, G, C, A, H, I, J, K and L3) 

FIG. 8C 